AMENDMENTS TO THE DRAWINGS 

Please insert the attached 14 Replacement Sheets of the drawings in place of 
the originally-filed drawings. Also attached are 39 Annotated Sheets of drawings showing 
the revisions required to delete Figure 5a and Figure 5b, as suggested by the Examiner. 
No new matter has been added. 
REMARKS 

Reconsideration is requested. 

An interview with the Examiner is arranged for January 22, 2008. The present 
Amendment and attachments are submitted in advance of the interview to advance 
prosecution. 

Claims 1-23 have been canceled, without prejudice. Claims 24-47 are pending. 
Claims 43-46 have been withdrawn from consideration. 

The specification, including the Figures, has been amended to delete Figures 
5a and 5b, to obviate the objection to the drawings. Attached are 14 Replacement 
Sheets of the drawings to be inserted in place of the originally-filed drawings. Also 
attached are 39 Annotated Sheets of drawings showing the revisions required to delete 
Figure 5a and Figure 5b, as suggested by the Examiner. No new matter has been 
added. Withdrawal of the objection to the drawings is requested. 

Claims 34 and 47 have been revised according to the Examiner's helpful 
suggestions stated in §10, spanning pages 5-6 of the Office Action dated October 18, 
2007. Withdrawal of the objections to claims 34 and 47 is requested. 

The Section 112, first paragraph "enablement", rejection of claim 32 is traversed. 
Reconsideration and withdrawal of the rejection are requested in view of the following. 

The Examiner is understood to believe that the applicants must "deposit" the 
origins of transfer RP4, pTiC58, F, RSF1010, ColE1 and R6K(a) recited in the claim, as 
the Examiner believes that 

"It is not clear that the entities represented by the said 
abbreviations are known plasmids or polynucleotides that 
are readily available to the public." See page 6 of the Office 
Action dated October 18, 2007. 

The following accession numbers, however, were obtained from an internet 
search of a publicly available database for the indicated origins of 
transfer: 



pTiC58 

LOCUS       ATUORITA     657 bp    DNA    linear    BCT    25-NOV-2003 
DEFINITION  Agrobacterium tumefaciens Ti plasmid pTiC58 origin of conjugal 
            transfer inverted repeat. 
ACCESSION   M95646 
VERSION     M95646.1  GI:938314 
KEYWORDS    inverted repeat; origin nick region; origin of conjugal transfer. 
JOURNAL     J. Bacteriol. 174 (19), 6238-6246 (1992) 



RP4 

LOCUS       X14165     99 bp    DNA    linear    BCT    14-NOV-2006 
DEFINITION  Plasmid RP4 oriT relaxation region. 
ACCESSION   X14165 
VERSION     X14165.1  GI:45790 
JOURNAL     Biochim. Biophys. Acta 951 (2-3), 365-374 (1988) 



F 

LOCUS       AP001918     290 bp    DNA    linear    BCT    28-NOV-2007 
DEFINITION  Plasmid F genomic DNA, complete sequence. 
ACCESSION   AP001918 REGION: 66118..66407 
VERSION     AP001918.1  GI:8918823 
AUTHORS     Helsberg,M. and Eichenlaub,R. 
TITLE       Twelve 43-base-pair repeats map in a cis-acting region essential 
            for partition of plasmid mini-F 



RSF1010 

LOCUS       X04830     157 bp    DNA    linear    BCT    14-NOV-2006 
DEFINITION  Escherichia coli plasmid RSF1010 mobilization genes mobA, mobB, 
            mobC. 
ACCESSION   X04830 REGION: 333..489 
VERSION     X04830.1  GI:42531 
AUTHORS     Derbyshire,K.M., Hatfull,G. and Willetts,N. 
TITLE       Mobilization of the non-conjugative plasmid RSF1010: a genetic 
            and DNA sequence analysis of the mobilization region 
JOURNAL     Mol. Gen. Genet. 206 (1), 161-168 (1987) 



R6K(a) 

LOCUS       X05644     639 bp    DNA    linear    BCT    20-MAR-1993 
DEFINITION  E. coli plasmid R6K alpha origin region. 
ACCESSION   X05644 
VERSION     X05644.1  GI:42652 
AUTHORS     Shafferman,A., Flashner,Y., Hertman,I. and Lion,M. 
TITLE       Identification and characterization of the functional alpha 
            origin of DNA replication of the R6K plasmid and its relatedness 
            to the R6K beta and gamma origins 
JOURNAL     Mol. Gen. Genet. 208 (1-2), 263-270 (1987) 



ColE1 

LOCUS       CE1CG13     2 bp    DNA    linear    BCT    19-NOV-2002 
DEFINITION  Plasmid ColE1, complete genome. 
ACCESSION   J01566 REGION: 1465..1466 
VERSION     J01566.1  GI:144307 
AUTHORS     Bastia,D. 
TITLE       Determination of restriction sites and the nucleotide sequence 
            surrounding the relaxation site of ColE1 
JOURNAL     J. Mol. Biol. 124 (4), 601-639 (1978) 
AUTHORS     Oka,A., Nomura,N., Morita,M., Sugisaki,H., Sugimoto,K. and 
            Takanami,M. 
TITLE       Nucleotide sequence of small ColE1 derivatives: structure of the 
            regions essential for autonomous replication and colicin E1 
            immunity 
JOURNAL     Mol. Gen. Genet. 172 (2), 151-159 (1979) 
AUTHORS     Chan,P.T., Ohmori,H., Tomizawa,J. and Lebowitz,J. 
TITLE       Nucleotide sequence and gene organization of ColE1 DNA 
JOURNAL     J. Biol. Chem. 260 (15), 8925-8935 (1985) 
AUTHORS     Stirling,C.J., Szatmari,G., Stewart,G., Smith,M.C. and 
            Sherratt,D.J. 
TITLE       The arginine repressor is essential for plasmid-stabilizing 
            site-specific recombination at the ColE1 cer locus 
JOURNAL     EMBO J. 7 (13), 4389-4395 (1988) 
AUTHORS     Inoue,N. and Uchida,H. 
TITLE       Transcription and initiation of ColE1 DNA replication in 
            Escherichia coli K-12 
JOURNAL     J. Bacteriol. 173 (3), 1208-1214 (1991) 



The above is believed to be persuasive evidence that the recited sequences are 
known and available to those of ordinary skill in the art. A further "deposit" of the 
sequences is not believed to be required by the applicants. Withdrawal of the Section 
112, first paragraph "enablement", rejection of claim 32 is requested. 

To the extent not obviated by the above amendments, the Section 112, second 
paragraph, rejection of claims 28-38, is traversed. Reconsideration and withdrawal of 
the rejection are requested in view of the above and the following comments. 

The revisions to claims 28 and 29 are believed to obviate the basis for the 
rejection stated in §14.A.) on page 8 of the Office Action dated October 18, 2007. The 
revisions to claim 29 are believed to obviate the basis for the rejection stated in §14.B.) 
on page 8 of the Office Action dated October 18, 2007. 

Consideration of the following is requested with regard to the basis stated in 
§14.C.) on pages 8-9 of the Office Action dated October 18, 2007. The applicants 
submit that one of ordinary skill in the art would understand, from the specification as 
well as the generally advanced level of skill in the art, that an origin of transfer 
functional in a selected host, according to claim 30, is an origin of transfer allowing DNA 
transfer between donor and recipient strains. The specification describes the same, for 
example, on pages 14-15 of the attached substitute specification, with citation to 
Guiney, D.G., Yakobson, E., (1983). Location and nucleotide sequence of the transfer 
origin of the broad host range plasmid RK2. Proc Natl Acad Sci USA, 80: 3595-8. and 
Zechner, E.L., de la Cruz, F., Eisenbrandt, R., Grahn, A.M., Koraimann, G., Lanka, E., 
Muth, G., Pansegrau, W., Thomas, C.M., Wilkins, B.M. and Zatyka, M. (2000) 
Conjugative-DNA transfer process. In Thomas, C.M. (ed.) The horizontal gene pool. 
Harwood Academic Publishers. The fact that origins of transfer functional in hosts are 
textbook material, as in the cited Zechner et al. reference, is evidence that the field is 
advanced and that one of ordinary skill will appreciate the metes and bounds of the 
recited origin of transfer functional in a selected host of claim 30, without requiring more. 

For similar reasons, the applicants believe that one of ordinary skill in the art will 
appreciate the metes and bounds of the origins of transfer functional in E. coli host cells, 
as recited in claim 31. The specification exemplifies the invention, and functional origins 
of transfer are well known to those of ordinary skill in the art. 

Similarly, the applicants believe that one of ordinary skill in the art will be able to 
select and confirm functional integrases and transcriptional promoters within the metes 
and bounds of claims 33 and 35. 

Claims 30, 31, 33 and 35 are submitted to be definite. 

The recitations of claim 32 are submitted to be definite. As demonstrated above, 
the terms objected-to by the Examiner are definite. 

Claims 37 and 38 have been revised in response to the Examiner's comments in 
§14.E.) on page 10 of the Office Action dated October 18, 2007. 

The claims are submitted to be definite. Withdrawal of the Section 112, second 
paragraph, rejection of claims 28-38 is requested. 

To the extent not obviated by the above amendments, the Section 102 rejection 
of claims 24-29, 35, 39, 41, 42 and 47 over RONDON et al. ("Cloning the Soil 
Metagenome: A Strategy for Accessing the Genetic and Functional Diversity of 
Uncultured Microorganisms", Applied and Environmental Microbiology, Washington, DC, 
US, Vol. 66, No. 6, June 2000, Pgs. 2541-2547) is traversed. Reconsideration and 
withdrawal of the rejection are requested in view of the following further comments. 

The cited art is understood to describe a method for producing a metagenomic 
library by using BAC vectors. To this end, DNA is extracted from the soil and inserted 
into these vectors in order to obtain a library of about 1000 Mbp with an average insert 
size of about 44.5 kb. 

A phylogenetic analysis is then performed on this library by amplification 
and sequencing of 16S rRNA genes. Furthermore, a screening for specific biological 
activities is performed in order to select clones exhibiting the desired activity. The 
localization and identification of the gene of interest are obtained by random 
mutagenesis with a transposon. 

Unlike the cited art, the presently claimed invention provides, for example, a 
method of analyzing a metagenomic library which allows analysis of the environmental 
DNA contained in the library by using different expression systems. 

Prior to the claimed invention, it was not possible to express some biochemical 
activities, and some activities were not expressible in sufficient amounts, in the host cells 
commonly used to produce the library, such as E. coli. These differences in expression 
could be due to transcription or translation/post-translation problems, or to the lack in 
the host cell of essential partners of a reaction chain. In addition, these activities could 
be encoded by operons comprising several consecutive genes. In order to detect these 
activities, the coding operon has to be contained in its entirety in the cloning vector. 
Consequently, cloned polynucleotides have to be large. However, the maintenance of 
large foreign DNA fragments in an episomal vector in host cells is often problematic. 

In the presently claimed invention, vectors containing the polynucleotide of 
interest are modified in order to be transferable to other types of host cells to test the 
expression of the activity in other expression systems, and to integrate the 
polynucleotide of interest into the genome of the new host cell for stable expression and 
maintenance. The cited art fails to teach or suggest integration of polynucleotides of 
interest in the genome of a new host cell, as presently claimed. 

The selected vectors containing the polynucleotide of interest are preferably 
modified by targeted insertion, into these vectors, of a target polynucleotide construct 
which contains genetic elements allowing the transfer into a new host cell and the 
integration of the polynucleotide of interest into the genome of this new (selected) host 
cell. See for example claim 28. 

The insertion of this target polynucleotide construct into the vector according to 
one embodiment of the claimed invention occurs in a region of the vector distinct from 
the polynucleotide in order to not disturb its expression. See for example claim 29. 

The target polynucleotide construct conferring transferable and integrable 
properties to the cloning vector according to a further embodiment of the invention, 
typically comprises the following genetic elements: 

an origin of transfer which allows conjugative transfer (such as in claims 30, 31, 
32); 

a nucleic acid encoding an integrase allowing integration of the vector (or of the 
polynucleotide of interest contained therein) into the genome of the new (selected) host 
cell (such as in claims 33 and 34); and/or 

a transcriptional promoter which is able to initiate gene transcription in the new 
(selected) host cell - the expression of the polynucleotide of interest initiated by this 
promoter allowing study of the desired activity in the selected host cell (such as in claim 
35). 

In a further embodiment of the claimed invention, the target polynucleotide 
construct is contained in or comprises a transposable nucleic acid construct, in order to 
be inserted in the selected vector (e.g., claim 36). This transposable nucleic acid 
construct is derived from transposons, which are genetic elements capable of moving 
from one genetic locus to another. 

According to a further embodiment of the invention, this transposable nucleic 
acid construct comprises two inverted repeats involved in transposition, a marker gene 
to select cells containing a vector in which the transposable nucleic acid construct is 
inserted, and the target polynucleotide construct as defined above (e.g., claim 37). 

The figure below shows, for illustrative purposes only and in a non-limiting 
manner, the structure of the vector after insertion of the transposable nucleic acid: 



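[Figure: cloning vector carrying the inserted transposable nucleic acid construct, with 
elements in the order IR - MG - TPC - IR] 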



IR: inverted repeat 
MG: marker gene 

TPC: target polynucleotide construct typically comprising an integrase, an origin 
of transfer and a transcriptional promoter. 

The cloning vector comprises in another region the polynucleotide of interest. 

The method of claim 38, for example, comprises a double selection to ensure 
that the insertion occurred at the site of the first marker gene, initially present in the 
cloning vector, and consequently outside the polynucleotide of interest. Cells are 
positively selected for the second marker and negatively selected for the first. 
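
For illustration only, the double selection described above can be sketched as a simple 
filter over clone phenotypes. This is a minimal sketch, not part of the claimed method: 
the Clone record, its field names and the example clones are hypothetical and are 
introduced here solely to show the logic of positive selection for the second marker 
combined with negative selection for the first. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    """Hypothetical record of a clone's marker phenotype (illustrative only)."""
    name: str
    first_marker_active: bool   # marker gene initially present in the cloning vector
    second_marker_active: bool  # marker gene carried by the transposable construct


def double_select(clones):
    """Keep clones positive for the second marker and negative for the first.

    Loss of the first marker indicates that the construct inserted into the
    first marker gene, and therefore outside the polynucleotide of interest.
    """
    return [
        c for c in clones
        if c.second_marker_active and not c.first_marker_active
    ]


clones = [
    Clone("A", first_marker_active=True, second_marker_active=False),  # no insertion
    Clone("B", first_marker_active=True, second_marker_active=True),   # insertion elsewhere
    Clone("C", first_marker_active=False, second_marker_active=True),  # insertion in first marker
]

selected = double_select(clones)
print([c.name for c in selected])
```

In this sketch only clone C is retained, since it carries the second marker and has lost 
the first. 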

The cited art of Rondon is understood, as noted above, to describe modifications 
of the plasmid containing environmental DNA to identify a gene of interest, such as with 
a transposon (by inactivation and sequencing from transposon ends), without teaching or 
suggesting, for example, integrating polynucleotides of interest into the genome of a 
host cell. 

The claims are believed to define over the cited art and withdrawal of the Section 
102 rejection is requested. 

The Section 103 rejection of claims 24-32, 35, 39-42 and 47 over Rondon and 
Chain (Journal of Bacteriology, Vol. 182: 5486-5494 (2000)) is traversed. The Section 
103 rejection of claims 24-35, 39-42 and 47 over Rondon, Chain and Groth (PNAS, vol 
97, 5995-6100 (2000)) is traversed. The Section 103 rejection of claims 24-42 and 47 
over Rondon, Chain, Groth and Berg (PNAS vol 79; 2632-2635 (1982)) and Devine (U.S. 
Patent No. 5,728,551) is traversed. Reconsideration and withdrawal of the rejections 
are requested in view of the above comments regarding the deficiencies of the primary 
reference (Rondon) and the following further distinguishing remarks. 

Specifically, Chain et al. is understood to describe an in vivo cloning method to 
clone very large fragments of bacterial genomic DNA. Firstly, cassettes containing an 
origin of transfer (oriT), a replication origin (oriV, suitable only for receiving cells) and 
an antibiotic resistance gene are inserted into specific and determined sites of the 
genome. Flanking regions are transferred into a replicative plasmid by a site-specific 
recombination mechanism. This plasmid is then transferred into the receiving cell by 
using triparental conjugation. 

In this method, the cassette containing genetic elements required for the 
conjugative transfer is inserted into the genome of the cell and not into a cloning vector 
contained in the cell, as required by the presently claimed invention. 

Moreover, the cassette contains only elements required for the conjugative 
transfer and not for the integration of the transferred DNA into the genome of the 
receiving cell. 

Thus, the method of Chain is understood to be only intended to pick up very 
large fragments of genomic DNA in BAC vectors (useful in genome sequencing) in 
order to maintain them in replicative episomal plasmids and not to integrate them into 
the genome of the receiving cell. 

Chain is not believed to cure the deficiencies noted above with regard to Rondon 
and there is not believed to be motivation or a suggestion in either document to alter 
Chain in a manner which would be required to have made the presently claimed 
invention. 

For completeness, the applicants note that Chain is believed to underscore 
maintenance difficulties related to large fragment cloning and the differential gene 
expression according to different expression systems (p. 5492, column 2, 2nd 
paragraph). Chain is not believed to suggest any solution to this problem however. 

The claimed invention would not have been obvious from the combined teaching 
of Rondon and Chain. 

Groth fails to cure the deficiencies of Rondon and Chain noted above. 

Specifically, Groth et al. is understood to describe the use of the ΦC31 integrase to 
perform site-specific unidirectional integrations into the genome of the host cell. This 
document demonstrates that this enzyme could be used in human cells. The document, 
taken with Rondon and Chain, is not believed to have made the presently claimed 
invention obvious. 

The combined teachings of Rondon, Chain, Groth and Berg would not have made 
the claimed invention obvious. The deficiencies of Rondon, Chain and Groth are 
discussed above. The teachings of Berg would not have overcome the deficiencies of 
the previously-discussed cited art. Berg et al. is understood to describe fundamental 
principles of the use of transposon elements in molecular biology, i.e. inverted repeats, 
transposase and marker gene (antibiotic resistance). 

None of these documents teaches or suggests the claimed method, such as the 
method of claim 24, to analyze a metagenomic library. None of the cited documents or 
their combination suggests modifying cloning vectors of the library to allow the transfer 
into other expression host cells and the integration of the polynucleotide of interest into 
the genome of these cells. 

Rondon et al. describes, at best, a method to produce a library and to identify 
gene(s) coding for the desired activity. 

Chain et al., Groth et al. and Berg et al. describe, at best, the use of origins of 
transfer, integrases and elements of transposon, respectively, in order to modify DNA 
constructs. These modifications are not used on cloning vectors to change the host cell 
and to perform integration into the genome of the host cell. 

The claims are submitted to be patentable over the cited art and withdrawal of 
the Section 103 rejections is requested. 

The claims are submitted to be in condition for allowance and a Notice to that 
effect is requested. The Examiner is requested to contact the undersigned, preferably 
by telephone, in the event anything further is required to place the application in 
condition for allowance. 
Respectfully submitted, 
NIXON & VANDERHYE P.C. 

By: /B. J. Sadoff/ 

B. J. Sadoff 
Reg. No. 36,663 

BJS: 

901 North Glebe Road, 11th Floor 
Arlington, VA 22203-1808 
Telephone: (703)816-4000 
Facsimile: (703)816-4100 
Method for the expression of unknown environmental DNA into adapted host cells 

This application is the US national phase of international application 
PCT/EP2003/007765 filed 17 July 2003 which designated the U.S. and claims benefit of 
EP 02291871.8, dated 24 July 2002, the entire contents of each of which are hereby 
incorporated by reference. 

Introduction and Background 

The present invention relates to methods and compositions for nucleic acid production, 
analysis and cloning. The present invention discloses tools and methods for the 
production and analysis of libraries of polynucleotides, particularly metagenomic 
libraries, which can be used to identify novel pathways, novel enzymes and novel 
metabolites of interest in various areas, including pharmaceutical, cosmetic, 
agrochemical and/or food industry. 

The drug discovery process is based on two main fields, namely combinatorial chemistry 
and natural products. Combinatorial chemistry has shown its ability to generate huge 
numbers of molecules, but with limited chemical diversity. In contrast, natural 
products have been the predominant source of structural and molecular diversity. 
However, the exploitation of this diversity is strongly hampered by their limited access, 
complex identification and purification processes, as well as by their production. 

Microorganisms are known to synthesize a large diversity of natural compounds which 
are already widely used in therapeutic, agricultural, food and industrial areas. However, 
this promising approach to the identification of new natural compounds has always been 
considerably limited by the principal technological bottlenecks of isolating and 
propagating in vitro the huge diversity of bacteria. Most microorganisms living in a 
natural, complex environment (soil, digestive tract, sea, etc.) have not been cultivated 
because their optimal living conditions are either unknown or difficult to reproduce. 
Numerous scientific publications report this fact, and it is now assumed that less than 
about 1% of the total bacterial diversity (when all environments are considered together) 
has been isolated and cultivated (Amann et al., 1995). 

New approaches have been developed to try to bypass the critical step of isolation, and 
to directly access the huge genetic potential established by the microbial adaptation 
processes through their long evolution. These approaches are called "Metagenomic" 
because they address a plurality of genomes of a whole bacterial community, without 
any distinction (metagenome). 

Metagenomics involves direct extraction of DNA from environmental samples and its 
propagation and expression in a cultivated host cell, typically a bacterium. Metagenomics 
was first developed for the identification of new bacterial phyla (Pace, 1997). 
This use is based on the specific cloning of genes recognized for their interest as 
phylogenetic markers, such as 16S rDNA genes. Further developments of metagenomics 
relate to the detection and cloning of genes coding for proteins of environmental or 
industrial interest. These first two applications of metagenomics involve a first step of 
gene selection (generally using PCR) before cloning. In the case of protein production, 
the cloning vectors used are preferably also expression vectors, i.e., they contain 
regulatory sequences upstream of the cloning site causing expression of the cloned gene 
in a given bacterial host strain. 

More recent developments of metagenomics consider the total metagenome cloned 
without any selection and/or identification, to establish random "Metagenomic DNA 
libraries". This provides access to the whole genetic potential of bacterial diversity 
without any "a priori" selection. Metagenomic DNA libraries are composed of hundreds 
of thousands of clones which differ from each other by the environmental DNA 
fragments which have been cloned. In this respect, large DNA fragments have been 
cloned (more than 30 kb), so as to (i) limit the number of clones which have to be 
analysed and (ii) be able to recover whole biosynthetic pathways for the identification 
of new metabolites resulting from multi-enzymatic synthesis. This last point is of 
particular interest for bacterial metagenomic libraries since most biosynthetic 
pathways have been found to be naturally organised in the same cluster of DNA and even 
in the same operon in bacteria. Nevertheless, the heterologous expression of a whole 
biosynthetic pathway (large DNA fragment) needs a much more advanced system than 
a simple expression vector to achieve full and stable expression. 

Except for the identification and characterisation of bacterial communities at the 
phylogenetic or diversity levels, metagenomic libraries produced in the prior art are gene 
expression libraries, i.e., the environmental DNA fragments are cloned downstream of a 
functional promoter, to allow their expression and analysis. In this regard, WO 99/45154 
and WO 96/34112 relate to combinatorial gene expression libraries which comprise a 
pool of expression constructs where each expression construct contains DNA which is 
operably associated with one or more regulatory regions that drive expression of genes 
in an appropriate organism. Furthermore, the expression constructs used in these 
methods have a very limited and invariable host range. Similarly, WO 01/40497 relates 
to the construction and use of expression vectors which can be transferred into one chosen 
expression bacterial host of the Streptomyces genus. All these approaches are, however, 
very limited since they require the presence of expression signals and confer invariable 
or very limited host range capabilities. Furthermore, most (if not all) metagenomic DNA 
libraries have been established in E. coli, which is the most efficient cloning system. 
However, most environmental DNAs are not expressed or functionally active in E. coli. 
In particular, functional analysis in E. coli of genes cloned from G+C rich organisms, 
such as Actinomyces, could be limited by the lack of an adequate transcription and 
translation system. Also, the posttranslational modification system in E. coli is not 
operative on heterologous proteins from Actinomycetes, and some specific substrates 
for protein activity are not present in E. coli. 

The stable maintenance of large foreign DNA fragments (>10 kb) in a selected host 
cell is one of the key points for academic research or applied industrial purposes. 
Usually, the vector carrying the foreign DNA is maintained by cultivating the host cells 
in a medium with a vector-specific selective pressure (resistance to an antibiotic for 
example). However, when large foreign DNA fragments are cloned and/or expressed, 
their propagation and/or expression require energy, which is no longer allocated to cell 
growth. As a consequence of this new resource allocation (nutrients/energy), it 
is not unusual to have a genetic rearrangement of the foreign DNA (deletion, 
modification, etc.) as a recombinant cell reaction. This results in the modification of the 
foreign genetic information and in the loss of DNA functionality. This can be observed 
without any loss of the selective pressure carried by the vector. As a result, the 
recombinant clone is no longer exploitable for genetic or functional analysis. 

Thus, the exploitation of the huge potential of metagenomics for the discovery of new 
natural compounds, pathways or genes cannot be achieved with currently existing 
methods. Alternative technologies and processes must be developed, to allow stable 
maintenance and propagation of large foreign DNAs into host cells for production of 
efficient libraries and functional screening in a large variety of host cell species, 
including Bacillus or Streptomyces, to take full account of the huge diversity of the 
environmental DNAs. 

Summary of the Invention 

The present invention discloses improved tools and methods for the production and 
analysis of libraries of polynucleotides, particularly metagenomic libraries, which can be 
used to identify and produce novel pathways, novel enzymes and novel metabolites of 
interest. 

More particularly, the invention now proposes to keep the advantage of highly efficient 
cloning in E. coli and to modify the properties of metagenomic libraries, to allow genetic 
and functional analyses of particular selected clones in any appropriate system, thereby 
making possible the stable maintenance and propagation, the analysis and/or the 
expression of the huge diversity of metagenomic libraries. According to the invention, 
polynucleotide libraries can be produced in any convenient cloning system, such as E. 
coli, and then modified, depending on the desired selection or screening system, to adapt 
host range and/or properties of the library (or a portion thereof). 

A particular object of this invention resides more specifically in a method of analysing a 
library of polynucleotides, said polynucleotides being contained in cloning vectors 
having a particular host range, the method comprising (i) selecting cloning vectors in the 
library which contain a polynucleotide having a particular characteristic, (ii) modifying 
said selected cloning vectors to allow a transfer of said vectors into a selected host cell 
and integration of the polynucleotide contained in said vectors into the genome of the 
selected host cell, and (iii) analysing the polynucleotides contained in said modified 
vectors upon transfer of said modified vectors into said selected host cell. 

Another object of this invention is a library of polynucleotides, wherein said library 
comprises a plurality of environmental DNA fragments cloned into cloning vectors, 
wherein said environmental DNA fragments contain a common molecular characteristic 
and wherein said cloning vectors are E. coli cloning vectors comprising a target 
polynucleotide construct allowing (i) transfer of the environmental DNA into a selected 
host cell distinct from E. coli, (ii) integration of the environmental DNA into the genome 
of a selected host cell, and (iii) stable maintenance and propagation of the environmental 
DNA into the selected host cell. 

A further object of this invention is a method of producing modified libraries of 
polynucleotides, the method comprising selecting a sub-population of clones in a first 
library, based on the presence or absence of a characteristic of interest, and modifying 
the properties of said selected clones to allow their functional analysis or expression. 

The modification in the library or cloning vector is typically obtained by targeted 
insertion of a polynucleotide construct, preferably using transposable elements, either in 
vitro or in vivo. 

The integration into the genome of the selected host cell is typically obtained by site 
specific integration or by homologous or heterologous DNA/DNA recombination. 

The invention is particularly suited for producing and analysing genetic diversity 
(metagenomic libraries), to identify new genes and isolate new metabolites, drugs, 
enzymes, antibiotics, etc. 
Legend to the Figures 

Figure 1 : Map of pPl vector carrying transposable construct Tn<Apra> (fig. 1a). Excised 
transposable construct is 1132 bp in size. It contains two mosaic ends (ME) and a gene 
conferring resistance to apramycin (Apra) (fig. 1b). 

Figure 2 : pPLl Vector (fig. 2a) that carries the conjugative transposable construct 
Tn<Apra-oriT> (fig. 2b). The nucleotide sequence of the transposable construct contains 
an origin of transfer (oriT) and a gene conferring resistance to apramycin (Apra). 
Direction of DNA transfer at oriT is shown by an arrow. 

Figure 3 : Transposable construct Tn<Apra-oriT-att-int> (fig. 3a) on vector pPAOI6 
(fig. 3b). The transposable construct contains the φC31 integrase gene and an attachment 
DNA sequence for site-specific integration, an origin of transfer and a gene for selection. 
The orientation of genes and direction of DNA transfer are marked by arrows. 

Figure 4 : Any target DNA suitable for insertion of a transposable construct (fig. 4a). 
Insertion of the conjugative and site-specific integrative transposable construct 
Tn<Apra-oriT-int> is shown in fig. 4b and 4c. Insertion of the transposable construct 
into the selective marker gene carried on the original vector is shown in fig. 4b. In 
this event, the cloned insert is intact and can be transferred to a heterologous host. 
Insertion of the transposable construct into the cloned DNA insert results in gene 
inactivation (fig. 4c). 

Figure 5 : Complete annotated DNA sequences of fosmid clones FS3-124 (a, SEQ ID 
NO: 1) and FS3-135 (b, SEQ ID NO: 2). 

Figure [[6]]5 : Morphological differences between Streptomyces transconjugants (assay). 
Conjugations have been performed with FS3-124 modified with transposable construct 
pPAOI6; (control) conjugations have been performed with pPAOI6. 

Figure [[7]]6: Schematic Map of pPSB vector (top) and transposable elements (bottom). 
The transposable element has 720 bp DNA of the amyE gene from B. subtilis in 
addition to att-int-oriT-Apra. 

Figure [[8]]7: Plasmid pPSBery (top) and transposable element (bottom) carrying 
selective marker ery AM, for resistance to erythromycin, and part of the amyE gene for 
homologous recombination in B. subtilis. 

Figure [[9]]8: Integrase φC31 was deleted from pPSBery plasmid. Resulting plasmid is 
pPSBery-DI (top). Transposable element contains Apra and ery AM genes for selection, 
oriT origin of transfer and a part of the amyE gene for integration into the amyE locus 
of the B. subtilis chromosome (bottom). 

Figure [[10]]9: Map of pTn5-7 AOI plasmid. Transposable element has ends of Tn5 
(ME) and Tn7 (T7 R, T7 L) transposons. 

Detailed Description of the Invention 

The invention provides novel strategies, methods and products for generating and 
analysing combinatorial gene libraries. As indicated above, the invention discloses, 
particularly, methods of analysing libraries of polynucleotides, said polynucleotides 
being contained in cloning vectors having a particular host range, the methods 
comprising (i) selecting cloning vectors in the library which contain a polynucleotide 
having a particular characteristic, (ii) modifying said selected cloning vectors to allow a 
transfer of said vectors and/or expression of the polynucleotide which they contain into a 
selected host cell, and (iii) analysing the polynucleotides contained in said modified 
vectors upon transfer of said modified vectors into said selected host cell, such as by 
genetic, biochemical, chemical or phenotypical approaches. 

In a most preferred embodiment, the methods allow stable transfer and propagation of 
large environmental nucleic acids in a selected host following initial selection. Such 
methods comprise (i) selecting cloning vectors in a library which contain a 
polynucleotide having a particular characteristic, (ii) modifying said selected cloning 
vectors to allow a transfer of said vectors and integration of the polynucleotide which 
they contain into a selected host cell genome, and (iii) analysing the polynucleotides 
contained in said modified vectors upon transfer of said modified vectors into said 
selected host cell, such as by genetic, biochemical, chemical or phenotypical 
approaches. 

Library of polynucleotides 

The term "library of polynucleotides" designates a complex composition comprising a 
plurality of polynucleotides, of various origins and structure. Typically, the library 
comprises a plurality of unknown polynucleotides, i.e., of polynucleotides whose 
sequence and/or source and/or activity is not known or characterized. In addition to such 
unknown (or uncharacterized) polynucleotides, the library may further include known 
sequences or polynucleotides. Typically, the library comprises more than 20 distinct 
polynucleotides, more preferably at least 50, typically at least 100, 500 or 1000. The 
complexity of the libraries may vary. In particular, libraries may contain more than 
5000, 10 000 or 100 000 polynucleotides, of various origin, source, size, etc. 
Furthermore, the polynucleotides are generally cloned into cloning vectors, allowing 
their maintenance and propagation in suitable host cells, typically in E. coli. The 
polynucleotides in the library may be in the form of a mixture or separated from each 
other, in all or in part. It should be understood that some or each polynucleotide in the 
library may be present in various copy numbers. 

The polynucleotides in the libraries are more preferably obtained or cloned from 
complex sources of nucleic acids, most preferably from environmental samples. Such 
libraries are also termed "metagenomic libraries" since they contain nucleic acids 
derived from whole genomes of mixed populations of microorganisms. 

The term environmental sample designates, broadly, any sample containing (a plurality 
of) uncharacterized (micro)organisms, particularly uncultivated (or non-cultivable) 
microorganisms. The sample may be obtained or derived from specific organisms, 
natural environments or from artificial or specifically created environments (e.g., 
industrial effluents, etc). An uncultivated (or non-cultivable) microorganism is a 
microorganism that has not been purposely cultured and expanded in isolated form. The 
sample may be obtained or derived from soil, water, mud, vegetal extract, wood, 
biological material, marine or estuarine sediment, industrial effluents, gas, mineral 
extracts, sand, natural excrements, meteorites, etc. The sample may be collected from 
various regions or conditions, such as tropical regions, deserts, volcanic regions, forests, 
farms, industrial areas, household, etc. 

Environmental samples usually contain various species of (uncharacterized, 
uncultivated) microorganisms, such as terrestrial microorganisms, marine 
microorganisms, salt water microorganisms, freshwater microorganisms, etc. Species of 
such environmental microorganisms include autotrophic or heterotrophic organisms, 
eubacteria, archaebacteria, algae, protozoa, fungi, viruses, phages, parasites, etc. The 
microorganisms may include extremophile organisms, such as thermophiles, 
psychrophiles, psychrotrophs, acidophiles, halophiles, etc. More specific examples of 
environmental bacteria include actinomycetes, eubacteria and mycobacteria; 
examples of fungi include phycomycetes, ascomycetes and basidiomycetes, etc. Other 
organisms include, for instance, yeasts (saccharomyces, kluyveromyces, etc.), plant cells 
(algae, lichens, etc.), corals, etc. The sample may comprise various species of such 
donor (uncultivated) microorganisms, as well as various amounts thereof. The 
environmental sample may contain, in addition, known and/or cultivable 
microorganisms (e.g., prokaryotic or eukaryotic), as well as nucleic acids and organic 
materials. The sample may also contain different animal cells: mammalian cells; insect 
cells, etc (arising from larvae, feces, etc.). 

It should be understood that the present invention is not limited to any specific type of 
sample or environmental microorganism, but can be used to produce diversity, create 
nucleic acid libraries, etc., from any environmental sample comprising uncultivated 
microorganisms. The sample may be wet, soluble, dry, in the form of a suspension, 
paste, powder, solid, etc. Preferably, the sample is dry or in solid or semi-solid state 
(e.g., paste, powder, mud, gel, etc.). The sample may be treated prior to nucleic acid 
extraction, for instance by washings, filtrating, centrifuging, diluting, drying, etc. 

The term "environmental DNA" designates any DNA fragment or collection obtained 
from an environmental sample. Nucleic acids may be extracted / isolated from the 
sample according to various techniques, such as those described in WO 01/81357, in 
WO 01/40497, in Handelsman et al. (Chemistry & Biology 5(10), 1998, R245), Rondon 
et al (Tibtech 17, 1999, 403 ; Applied and Environm. Microbiol. 66, 2000, 2541), 
Miller et al (Applied and Environm. Microbiol. 65, 1999, 4715) or Frostegard et al. 
(Applied and Environm. Microbiol., 65, 1999, 5409). 

In a particular embodiment of the above method, the library comprises a plurality of 
environmental DNA fragments. The library may also comprise other types of nucleic 
acids, such as environmental RNAs, for instance. 

The polynucleotides have a size which is typically comprised between 10 and 100 kb, 
more preferably between 20 and 80 kb, typically between 30 and 80 kb. Although not 
mandatory, it is preferred that the polynucleotide fragments in the library all have 
similar size, to produce homogenous libraries. 

Cloning Vectors 

As indicated, the polynucleotides are contained or cloned into cloning vectors. These 
vectors may be of various types, including plasmids, cosmids, fosmids, episomes, 
artificial chromosomes, phages, viral vectors, etc. In most preferred embodiments, the 
cloning vectors are selected from plasmids, cosmids, phages (e.g., P1 derivatives) and 
BACs, even more preferably from cosmids, P1 derivatives and BACs. By using cosmids 
or P1 derivatives, it is possible to generate homogenous libraries, since these vectors 
essentially accommodate polynucleotides having a size of approximately 40 kb and 80 
kb, respectively. Furthermore, since the invention provides that the vectors are modified 
after the initial cloning step, the cloning capacity of the vectors is maximized and inserts 
of as much as 40 and 80 kb in length can be cloned into fosmids, BACs and P1 
derivatives. 

As indicated the cloning vector has a particular host range, i.e., the ability to replicate in 
a particular type of host cell. Typically, the host is a bacterium, more preferably an E. coli 
strain. Indeed, E. coli is so far the most convenient host cell for performing recombinant 
technologies. The advantage of the present invention is that the starting library can be 
produced in any suitable host system of choice, since the properties of the libraries will 
be adapted later during the process. 

Cloning vectors generally comprise the polynucleotide insert and genetic elements 
necessary and sufficient for maintenance into a competent host cell. They typically 
contain, in addition to the polynucleotide insert, an origin of replication functional in a 
selected host cell as well as a marker gene for selection and screening. The cloning 
vector may comprise additional elements, such as promoter regions, for instance. 
Although cloning vectors may replicate in several different host cells, they are usually 
adapted to a particular host cell type and not suitable or efficient for replication or 
maintenance in other cell types. 

In a preferred embodiment, the cloning vectors of the library are E. coli cloning vectors, 
preferably cosmids, BACs or P1-derived vectors. E. coli cloning vectors may carry an 
origin of replication derived from naturally-occurring plasmids, such as ColE1, pACYC 
and p15A, for instance. Many E. coli cloning vectors are commercially available and/or 
can be constructed using available regulatory sequences. 

Screening of the cloning vectors 

In step i) of the method, a first selection or screen is performed on the polynucleotide 
library. The screen is performed so as to identify or select clones having (or lacking) a 
particular, common characteristic. The selection may be carried out according to various 
techniques, such as molecular screening, protein expression, functional screening, etc. A 
preferred selection is performed by molecular screening. Molecular screening designates 



Substitute Specification Marked-up copy - Appln. No. 10/522,037 

any method of identification of molecular or structural characteristics in a 
polynucleotide sequence. This can be made by a variety of techniques which are known 
per se, such as hybridisation, amplification, sequencing, etc. Preferably, molecular 
screening comprises the selection of clones in a library which contain, in their sequence, 
a particular sequence or region or motif, said sequence or region or motif being 
characteristic of a particular type of activity or gene (enzyme, biosynthetic pathways, 
etc.). 

In a first variant, the selection is made by contacting the cloning vectors in the library 
with a particular nucleic acid probe (or set of probes) containing a sequence which is 
characteristic of a selected activity or function (a consensus sequence, a particular motif, 
etc.). The cloning vectors in the library which hybridise to the probe (or set of probes) 
are then selected. 

In a second variant, the selection is made by contacting the cloning vectors in the library 
with a particular pair of nucleic acid primers specific for a sequence which is 
characteristic of a selected activity or function (a consensus sequence, a particular motif, 
etc.), and a PCR amplification reaction is performed. The cloning vectors in the library 
which lead to a positive amplification product are then selected. 

In this regard, the present application provides new primers designed in conserved 
motifs of the β-ketoacyl synthase gene, which are particularly useful for screening 
polynucleotides containing putative polyketide synthase (PKS) genes or domains. These 
primers have the following degenerate sequences : 

Sense primer : 5'- GGSCCSKCSSTSDCSRTSGAYACSGC -3' (SEQ ID NO: 3) 
Antisense primer : 5'- GCBBSSRYYTCDATSGGRTCSCC -3' (SEQ ID NO: 4) 

wherein : 

R is A or G 

S is G or C 

Y is C or T 

K is G or T 

D is A or G or T, and 

B is C or G or T. 
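
As an illustration only, the degenerate primers above can be expanded into the individual oligonucleotides they represent using the base codes just listed. The following Python sketch is not part of the disclosed method; the names `DEGENERATE`, `degeneracy` and `expand` are hypothetical helpers introduced here, and the only inputs taken from the text are the two primer sequences and the code table.

```python
from itertools import product

# Base codes exactly as defined in the text above (plain bases map to themselves).
DEGENERATE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "S": "GC", "Y": "CT", "K": "GT",
    "D": "AGT", "B": "CGT",
}

SENSE = "GGSCCSKCSSTSDCSRTSGAYACSGC"   # SEQ ID NO: 3
ANTISENSE = "GCBBSSRYYTCDATSGGRTCSCC"  # SEQ ID NO: 4


def degeneracy(primer):
    """Number of distinct non-degenerate oligonucleotides encoded by `primer`."""
    count = 1
    for base in primer:
        count *= len(DEGENERATE[base])
    return count


def expand(primer):
    """Yield every non-degenerate sequence matching the degenerate `primer`."""
    for combo in product(*(DEGENERATE[base] for base in primer)):
        yield "".join(combo)


if __name__ == "__main__":
    for name, primer in (("sense", SENSE), ("antisense", ANTISENSE)):
        print(name, len(primer), "nt, degeneracy", degeneracy(primer))
```

A "mixture of different polynucleotide primers", as referred to in the text, would correspond to some or all of the sequences produced by `expand`.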

A particular object of this invention is a polynucleotide primer having one of the above 
sequences, typically a mixture of different polynucleotide primers having a sequence 
corresponding to one of the above degenerated sequences. A particular object of this 
invention also resides in a pair of primers each having one of the above sequences. 

Once particular clones have been selected, the analysis of their polynucleotides needs to 
be confirmed and/or validated, and/or their polynucleotides can be used to study their 
function and/or produce novel compounds or metabolites. 

The invention now enables such further analysis and uses, by allowing a modification of 
the cloning vectors that is specific and adaptable by the skilled person, depending on the 
activity which is sought. In particular, it is possible to confer properties such as specific 
expression or a novel, specific host range to the selected cloning vectors, to assess their 
activity, as disclosed below. 

Modification of the cloning vectors 

After high efficiency cloning using the most convenient cloning vectors such as BACs or 
cosmids propagated in E. coli, and after the identification, selection and/or 
characterisation of cloned DNA fragments, the invention now allows the cloning vectors 
to be specifically modified in order to transfer, integrate into the genome, maintain, 
express and/or over-express the selected polynucleotides in any selected host expression 
system, which is suitable to assess the selected activity or property. Such selected hosts 
may be native or heterologous host cells, and include, but are not limited to, for example 
Streptomyces, Nocardia, Bacillus, fungi, yeasts, etc. 

The selected cloning vectors of the library may be modified according to various 
techniques. The modification is typically a genetic modification, comprising the 
introduction of particular genetic sequences into the structure of the cloning vector, in 
addition to or in replacement of sequences contained in said vector. It is highly preferred 
to use specific or targeted (or oriented) techniques to improve the efficacy of the 
method. By "specific" is meant that the modification occurs at a pre-determined location 
in the cloning vector, through site-specific mechanisms. By "targeted" is meant that the 
modification occurs in a controlled way, so as not to alter the polynucleotide insert 
contained in the vector in a non-desirable way. 

In a preferred embodiment, the selected vectors are modified by insertion, into the 
vector, of a target polynucleotide construct which contains genetic elements conferring 
the selected property(ies) to the cloning vector. 

The target polynucleotide construct typically comprises the genetic elements necessary 
to transfer, propagate, integrate into the genome, maintain, express or overexpress the 
cloned polynucleotide into a chosen (bacterial) host expression system. Said genetic 
elements may include particular origin(s) of replication, particular origin(s) of transfer, 
particular integrase(s), transcriptional promoter(s) or silencer(s), either alone or in 
combination(s). 

In a first, preferred variant, the target polynucleotide construct comprises a genetic 
element allowing transfer of the vector into a selected host cell. 

The natural mechanism of DNA transfer between donor and recipient strains is known under 
the term conjugation or conjugative transfer. Conjugative transfer can occur between 
different strains of the same species as well as between strains of different species. 
Many naturally occurring plasmids carry so-called tra genes, which are involved in and 
mediate conjugative transfer. The DNA transfer starts at a specific DNA structure, known 
as an origin of transfer or "oriT". The presence of such an oriT in a vector allows said 
vector to be transferred into a desired host cell. 

In a particular, preferred embodiment, the target polynucleotide construct comprises an 
origin of transfer functional in the selected host cell. 

The structure of various oriT has been reported in the art (Guiney et al. 1983; Zechner et 
al. 2000). In a specific embodiment, the origin of transfer is selected (or derived) from 
RP4, pTiC58, F, RSF1010, ColE1 and R6K(a). 



A specific example of an oriT which can be used in the present invention derives from 
plasmid RP4 and has or comprises all or a functional part of the following sequence 
(SEQ ID NO: 5) : 



gatctGTGATGTACTTCACCAGCTCCGCGAAGTCGCTCTTCTTGATTGGAGCGCATGGG 

GACGTGCTTGGCAATCACGCGCACCCCCCGGCCGTTTTAGCGGCTAAAAAAGTCATG 

GCTCTGCCCTCGGGCGGACCACGCCCATCATGACCTTGCCAAGCTCGTCCTGCTTCT 

CTTCGATCTTCGCCAGCAGGGCGAGGATCGTGGCATCACCGAACCGCGCCGTGCGC 

GGGTCGTCGGTGAGCCAGAGTTTCAGCAGGCCGCCCAGGCGGCCCAGGTCGCCATT 

GATGCGGGCCAGCTCGCGGACGTGCTCATAGTCCACGACGCCCGTGATTTTGTAGCC 

CTGGCCGACGGCCAGCAGGTAGGCCGACAGGCTCATGCCGGCCGCCGCCGCCTTTT 

CCTCAATCGCTCTTCGTTCGTCTGGAAGGCAGTACACCTTGATAGGTGGGCTGCCCT 

TCCTGGTTGGCTTGGTTTCATCAGCCATCCGCTTGCCCTCATCTGTTACGCCGGCGGT 

AGCCGGCCAGCCTCGCAGAGCAGGATTCCCGTTGAGCACCGCCAGGTGCGAATAAG 

GGACAGTGAAGAAGGAACACCCGCTCGCGGGTGGGCCTACTTCACCTATCCTGCCC 

GGCTGACGCCGTTGGATACACCAAGGAAAGTCTACACGAACCCTTTGGCAAAATCC 

TGTATATCGTGCGAAAAAGGATGGATATACCGAAAAAATCGCTATAATGACCCCGA 

AGCAGGGTTATGCAGCGGAAAAGATCCGTCGGATCT 



The term "functional part" designates any fragment or variants of the above sequence 
which retain the capacity to cause conjugative transfer. Such fragments typically 
comprise at least 80%, preferably at least 85% or 90% of the above sequence. Variants 
may include one or several mutations, substitutions, deletions or additions of one or 
several bases. 



In another particular variant, the target polynucleotide construct comprises a genetic 
element allowing integration of the vector (or of the polynucleotide contained therein) 
into the genome of the selected host cell. 



A donor DNA is permanently or stably maintained and expressed in a selected recipient 
cell if it is integrated into the recipient cell's genome or if it contains elements that allow 
autonomous replication in said cell. In a most preferred embodiment, the vector is 
modified to allow transfer and integration of the polynucleotide into the host cell 
genome. 

Integration is a preferred way of ensuring stable expression. Integration can be obtained 
by physical recombination. Recombination can be homologous, i.e., between two 
homologous DNA sequences, or illegitimate, where recombination occurs between two 
non-homologous DNAs. As a particular example, integration of donor DNA into the 
chromosome of the recipient can be mediated by the host recombination repair system or by 
site-specific recombination. Another well-studied process that can transfer and integrate 
genes is transduction by bacterial viruses, such as λ and φC31. In a phage-infected 
bacterial cell, fragments of the host DNA are occasionally packaged into phage particles 
and can then be transferred to a recipient cell. Integration into the recipient cell's 
genome is caused by an integrase. 



In a specific embodiment, the target polynucleotide construct comprises a nucleic acid 
encoding an integrase functional in the selected host cell. More preferably, the integrase 
is selected from λ and φC31 integrases. In a specific embodiment, the polynucleotide 
construct comprises a nucleic acid encoding an integrase having or comprising all or a 
functional part of the following sequence of the φC31 integrase (SEQ ID NO: 6) : 



AGATCTCCCGTACTGACGGACACACCGAAGCCCCGGCGGCAACCCTCAGCGGATGC 

CCCGGGGCTTCACGTTTTCCCAGGTCAGAAGCGGTTTTCGGGAGTAGTGCCCCAACT 

GGGGTAACCTTTGAGTTCTCTCAGTTGGGGGCGTAGGGTCGCCGACATGACACAAG 

GGGTTGTGACCGGGGTGGACACGTACGCGGGTGCTTACGACCGTCAGTCGCGCGAG 

CGCGAGAATTCGAGCGCAGCAAGCCCAGCGACACAGCGTAGCGCCAACGAAGACA 

AGGCGGCCGACCTTCAGCGCGAAGTCGAGCGCGACGGGGGCCGGTTCAGGTTCGTC 

GGGCATTTCAGCGAAGCGCCGGGCACGTCGGCGTTCGGGACGGCGGAGCGCCCGGA 

GTTCGAACGCATCCTGAACGAATGCCGCGCCGGGCGGCTCAACATGATCATTGTCTA 

TGACGTGTCGCGCTTCTCGCGCCTGAAGGTCATGGACGCGATTCCGATTGTCTCGGA 

ATTGCTCGCCCTGGGCGTGACGATTGTTTCCACTCAGGAAGGCGTCTTCCGGCAGGG 

AAACGTCATGGACCTGATTCACCTGATTATGCGGCTCGACGCGTCGCACAAAGAATC 

TTCGCTGAAGTCGGCGAAGATTCTCGACACGAAGAACCTTCAGCGCGAATTGGGCG 

GGTACGTCGGCGGGAAGGCGCCTTACGGCTTCGAGCTTGTTTCGGAGACGAAGGAG 

ATCACGCGCAACGGCCGAATGGTCAATGTCGTCATCAACAAGCTTGCGCACTCGAC 

CACTCCCCTTACCGGACCCTTCGAGTTCGAGCCCGACGTAATCCGGTGGTGGTGGCG 

TGAGATCAAGACGCACAAACACCTTCCCTTCAAGCCGGGCAGTCAAGCCGCCATTC 

ACCCGGGCAGCATCACGGGGCTTTGTAAGCGCATGGACGCTGACGCCGTGCCGACC 

CGGGGCGAGACGATTGGGAAGAAGACCGCTTCAAGCGCCTGGGACCCGGCAACCGT 

TATGCGAATCCTTCGGGACCCGCGTATTGCGGGCTTCGCCGCTGAGGTGATCTACAA 

GAAGAAGCCGGACGGCACGCCGACCACGAAGATTGAGGGTTACCGCATTCAGCGCG 

ACCCGATCACGCTCCGGCCGGTCGAGCTTGATTGCGGACCGATCATCGAGCCCGCTG 

AGTGGTATGAGCTTCAGGCGTGGTTGGACGGCAGGGGGCGCGGCAAGGGGCTTTCC 

CGGGGGCAAGCCATTCTGTCCGCCATGGACAAGCTGTACTGCGAGTGTGGCGCCGT 

CATGACTTCGAAGCGCGGGGAAGAATCGATCAAGGACTCTTACCGCTGCCGTCGCC 

GGAAGGTGGTCGACCCGTCCGCACCTGGGCAGCACGAAGGCACGTGCAACGTCAGC 

ATGGCGGCACTCGACAAGTTCGTTGCGGAACGCATCTTCAACAAGATCAGGCACGC 

CGAAGGCGACGAAGAGACGTTGGCGCTTCTGTGGGAAGCCGCCCGACGCTTCGGCA 

AGCTCACTGAGGCGCCTGAGAAGAGCGGCGAACGGGCGAACCTTGTTGCGGAGCGC 

GCCGACGCCCTGAACGCCCTTGAAGAGCTGTACGAAGACCGCGCGGCAGGCGCGTA 

CGACGGACCCGTTGGCAGGAAGCACTTCCGGAAGCAACAGGCAGCGCTGACGCTCC 

GGCAGCAAGGGGCGGAAGAGCGGCTTGCCGAACTTGAAGCCGCCGAAGCCCCGAA 

GCTTCCCCTTGACCAATGGTTCCCCGAAGACGCCGACGCTGACCCGACCGGCCCTAA 

GTCGTGGTGGGGGCGCGCGTCAGTAGACGACAAGCGCGTGTTCGTCGGGCTCTTCGT 

AGACAAGATCGTTGTCACGAAGTCGACTACGGGCAGGGGGCAGGGAACGCCCATCG 

AGAAGCGCGCTTCGATCACGTGGGCGAAGCCGCCGACCGACGACGACGAAGACGAC 

GCCCAGGACGGCACGGAAGACGTAGCGGCGTAGCGAGACACCCG 



The term "functional part" designates any fragment or variants of the above sequence 
which retain the capacity to cause integration. Such fragments typically comprise at least 
80%, preferably at least 85% or 90% of the above sequence. Variants may include one 
or several mutations, substitutions, deletions or additions of one or several bases. 



In a more preferred variant, the target polynucleotide construct comprises genetic 
elements allowing transfer of the cloning vector into the selected host cell and 
integration of the cloning vector or a portion thereof into the genome of the selected host 
cell. Most preferred polynucleotide constructs comprise an oriT and a nucleic acid 
encoding an integrase. 



In another variant, the target polynucleotide construct comprises an origin of replication 
specific for or functional in the selected host cell. The origin of replication may be 
selected (or derived), for instance, from pAMβ1, pSa, 2µm circle, pSam2, pSG1, pIJ101, 
SCP2, pA387 and artificial chromosomes. 



In another variant, the target polynucleotide construct comprises a transcriptional 
promoter functional in the selected host cell. As indicated above, in a particular variant, 
the invention allows the cloning vector to be modified to enable expression or 
over-expression of the cloned polynucleotides in the selected host. The expression of genes 
is driven mainly by transcriptional promoters, which initiate gene transcription. The type 
of promoter to be used in the present invention can be selected by the skilled person, 
depending on the selected host cell and type of expression needed. Promoters may be 
ubiquitous or cell-specific, regulated or constitutive, weak or strong. They may be of 
various origins, including promoters isolated from viruses, phages, plant cells, bacterial 
genes, mammalian genes, etc., or they may be artificial or chimeric. Typical examples of 
promoters include T7, T4, LacZ, trp, ara, SV40, tac, λPL, GAL, AOX, hsp-70, etc. 

The target polynucleotide construct is typically a DNA molecule, although RNAs may 
also be used as starting material. It is typically a double-stranded DNA. The target 
polynucleotide construct may be produced by conventional recombinant DNA 
techniques, including DNA synthesis, cloning, ligation, restriction digestion, etc. and a 
combination thereof. 

The target polynucleotide construct is preferably engineered so as to be inserted in a 
region of the vector distinct from the polynucleotide. Indeed, it is important that the 
integrity of the polynucleotides is preserved. Directed insertion may be accomplished in 
a variety of ways, including site-specific insertion using particular enzymatic systems 
(Cre/Lox, FLP, etc.), homologous recombination with particular target sequences 
present in the vector, or by the use of transposons or transposable elements and 
appropriate selection means. 

In a particular, preferred embodiment, the target polynucleotide construct is contained in 
or comprises a transposable nucleic acid construct. Indeed, in a preferred variant, the 
methods of the present invention use transposable elements to alter the properties of the 
cloning vectors, and allow their transfer, maintenance, expression or over-expression in 
a selected host cell. 

Transposable nucleic acid constructs are derived from transposons, which are genetic 
elements capable of moving from one genetic locus to another. Two main classes of 
transposons have been identified in bacteria. The simplest transposons comprise an 
insertion sequence that carries only elements of transposition. These elements are two 
inverted DNA repeats and a gene that codes for a protein called transposase. The 
transposase catalyses the excision and integration of the transposon. It has been shown 
that the excision and integration reaction can be catalysed in trans by a transposase, 
which can be provided in vivo or in vitro in purified form or expressed from a different 
construct. More complex transposons carry more insertion sequences and additional 
genes that are not involved in transposition. 

Transposable nucleic acid constructs of this invention thus typically comprise, flanked 
by two inverted repeats, the target polynucleotide construct and, more preferably, a 
marker gene. In the presence of a transposase, these transposable nucleic acid constructs 
can integrate into a cloning vector in vivo or in vitro, thereby providing for targeted 
polynucleotide insertion. Alternatively, such nucleic acid constructs can be used for 
targeted integration, in the absence of a transposase, in particular strains such as 
hypermutator strains. Such transposable nucleic acid constructs also represent a 
particular object of the present application. In this regard, in a more preferred 
embodiment, the invention also relates to a transposable nucleic acid construct, wherein 
said construct comprises an origin of transfer flanked by two inverted repeats. Specific 
examples of such constructs are transposons pPL1 and pPAOI6, as disclosed in the 
experimental section. The transposable nucleic acid construct may further comprise an 
integrase gene and/or a marker gene. 

The inverted repeat nucleic acid sequences may be derived from the sequence of various 
transposons, or artificially created. In particular, transposable elements can be generated 
using inverted repeats obtained from transposons or transposable elements such as Tn5, 
Tn21, miniTn5, Tn7, Tn10, Tn917, miniTn400, etc. Preferably, the sequences derive from 
transposon Tn5. In a specific embodiment, they comprise all or a functional part of the 
following sequences : 

- left arm of pPAOI6 transposon (SEQ ID NO: 7) 

CTGTCTCTTATACACATCTCAACCATCATCGATGAATTTTCTCGGGTGTTCTCGCATA 
TTGGCTCGAATTCGAGCTCGGTACCC 

- right arm of transposon pPAOI6 (SEQ ID NO: 8) 

GATCCTCTAGAGTCGACCTGCAGGCATGCAAGCTTGCCAACGACTACGCACTAGCC 
AACAAGAGCTTCAGGGTTGAGATGTGTATAAGAGACAG 
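
As an illustration only, the inverted-repeat relationship between the two arms listed above can be checked computationally: the outer end of the left arm should equal the reverse complement of the outer end of the right arm. The following Python sketch is not part of the disclosed method; the helper names `reverse_complement` and `inverted_repeat_length` are hypothetical, and the two sequences are copied from SEQ ID NO: 7 and SEQ ID NO: 8 as printed.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

LEFT_ARM = (  # SEQ ID NO: 7, as printed above
    "CTGTCTCTTATACACATCTCAACCATCATCGATGAATTTTCTCGGGTGTTCTCGCATA"
    "TTGGCTCGAATTCGAGCTCGGTACCC"
)
RIGHT_ARM = (  # SEQ ID NO: 8, as printed above
    "GATCCTCTAGAGTCGACCTGCAGGCATGCAAGCTTGCCAACGACTACGCACTAGCC"
    "AACAAGAGCTTCAGGGTTGAGATGTGTATAAGAGACAG"
)


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def inverted_repeat_length(left, right):
    """Count how many outermost bases of `left` match the reverse
    complement of `right`, reading inward from the two outer ends."""
    length = 0
    for a, b in zip(left.upper(), reverse_complement(right)):
        if a != b:
            break
        length += 1
    return length


if __name__ == "__main__":
    print("terminal inverted repeat:",
          inverted_repeat_length(LEFT_ARM, RIGHT_ARM), "nt")
```

The same check could be applied to inverted repeats taken from other transposons, provided their sequences are supplied by the user.
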

The marker gene may be any nucleic acid encoding a molecule whose presence in a cell 
can be detected or visualized. Typical marker genes encode proteins conferring 
resistance to antibiotics, such as apramycin, chloramphenicol, ampicillin, kanamycin, 
spectinomycin, thiostrepton, etc. Other types of markers confer auxotrophy or produce 
a label (e.g., galactosidase, GFP, luciferase, etc). 

In a specific embodiment, the cloning vector in the library comprises a first marker gene 
and the modification step ii) comprises: 

. contacting in vitro, in the presence of a transposase, the selected cloning vectors with a 
transposon comprising, flanked by two inverted repeats, the target polynucleotide 
construct and a second marker gene distinct from the first marker gene, and 
. selecting the cloning vectors which have acquired the second marker gene and which 
have lost the first marker gene. 

The double selection ensures that the target polynucleotide construct has been inserted at 
a site within the marker gene present in the cloning vector, i.e., outside of the 
polynucleotide insert. 
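
As an illustration only, the logic of the double selection described above can be written as a simple classification of clones by their two marker phenotypes. The following Python sketch is not part of the disclosed method; the `Clone` record, its field names and the category labels are hypothetical and merely restate the reasoning of the preceding paragraphs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    name: str
    first_marker: bool   # phenotype of the marker gene originally carried by the vector
    second_marker: bool  # phenotype of the marker gene carried by the transposon


def classify(clone):
    """Interpret the two marker phenotypes of a clone after transposition."""
    if not clone.second_marker:
        return "no insertion"
    if clone.first_marker:
        # Transposon present but first marker intact: the insertion lies
        # elsewhere, possibly within the cloned polynucleotide insert.
        return "insertion outside first marker"
    return "insertion in first marker"


def double_select(clones):
    """Keep clones that acquired the second marker and lost the first one."""
    return [c for c in clones if classify(c) == "insertion in first marker"]


if __name__ == "__main__":
    example = [
        Clone("a", first_marker=True, second_marker=True),
        Clone("b", first_marker=False, second_marker=True),
        Clone("c", first_marker=True, second_marker=False),
    ]
    print([c.name for c in double_select(example)])
```

Only clones of the last category are retained, which corresponds to insertion within the first marker gene and therefore outside of the polynucleotide insert.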

It should be understood that the modification may be accomplished in various other 
ways, particularly by incorporating a sequence coding for the transposase directly into 
the transposable element or into another expression unit. The presence and expression of 
the transposase can be regulated by an inducible promoter or thermosensitive replicative 
units. Also, transposition can be carried out by an in vitro process. 

Analysis of the polynucleotides 

In step (iii) of the process, the polynucleotides may be analysed by various methods, 
including by genetic, biochemical, chemical or phenotypical approaches, which are well 
known per se in the art. Analysis occurs upon transfer and, optionally, expression of the 
polynucleotides into the selected host cell. 

In this regard, the modified cloning vectors can be transferred into the selected host cell 
by a variety of techniques known in the art, including by transformation, electroporation, 
transfection, protoplast fusion, conjugative transfer, etc. In a preferred embodiment, the 
target polynucleotide construct comprises an oriT and the modified vectors are 
transferred into the selected host cells by conjugative transfer. In this embodiment, the 
cloning vector and the selected host cells are co-cultivated and the recombinant host 
cells are selected and isolated. 

The selected host cell may be any type of cell or microorganism, including, without 
limitation, Streptomyces, E. coli, Salmonella, Bacillus, yeast, fungi, etc. 

One of the objectives of the invention is to be able to analyse environmental DNAs of 
unknown cellular origin in different host expression systems. In order to analyse the 
potential of the DNAs at the transcription and/or translation levels and to increase the 
probability of obtaining DNA expression, it is important to be able to test different 
host expression systems. 

Insertion of foreign DNA into host expression systems like Streptomyces can produce, 
for instance, an increase in doubling time, morphological modifications, pigment 
production, etc., which can be related either directly to the expression of the foreign DNA or 
to the combinatorial biology of the foreign DNA and the biology of the expression host 
system. The new phenotypes can be analysed by all techniques known in the art, such 
as genetic, biochemical, chemical or phenotypic approaches, etc. 

In a preferred, specific embodiment, the invention relates to a method for the 
identification or cloning of polynucleotides encoding a selected phenotype, the method 
comprising (i) cloning environmental DNA fragments into E. coli cloning vectors to 
produce a metagenomic library, (ii) identifying or selecting cloning vectors in said 
library which contain DNA fragments having a particular characteristic of interest, (iii) 
modifying the identified or selected cloning vectors into shuttle or expression vectors for 
transfer and integration in a selected host cell, (iv) transferring the modified cloning 
vectors into said selected host cell and (v) identifying or cloning the DNA fragments 
contained in said modified cloning vectors which encode said selected phenotype in said 
selected host cell. 



By applying the above method, new polynucleotide sequences have been identified, 
cloned and characterized, which produce new phenotypes in bacteria. These 
polynucleotides contain the sequence of PKS genes and other genes that encode 
polypeptides involved in biosynthetic pathways. The sequence of these polynucleotides 
is provided in SEQ ID NO:1 and SEQ ID NO:2 [[Figure 5]]. 

The complete annotated DNA sequences of fosmid clones FS3-124 and FS3-135 are 
provided in SEQ ID NOs: 1 and 2, respectively, and as further described in the 
following CDS information. 

Specifically, the following CDS information relates to SEQ ID NO:1. 



CDS 76..1134 
/note="ABC transporter" 
/gene="TAP2 PROTEIN" 
/blastp match="Oryzias latipes" 
/blast score=0.002 

CDS complement (1096..2430) 
/note="none" 
/blastp match="Anabaena sp" 
/gene="ALRH17 PROTEIN" 
/blast score=2e-18 

CDS 1178..1624 
/note="Gram pos anchor" 
/gene="CELL WALL SURFACE ANCHOR" 
/blast score=1e-04 
/blastp match="Streptococcus pneumoniae" 

CDS complement (2506..3567) 
/note="CONSERVED" 
/gene="HYPOTHETICAL PROTEIN" 
/blast score=0.019 
/blastp match="Deinococcus radiodurans" 

CDS complement (2906..4222) 
/note="glycosyl transferase" 
/gene="lipopolysaccharide" 
/blast score=2e-23 

CDS complement (4092..5321) 
/note="glycosyl transferase" 
/gene="glycosyl transferase" 
/blast score=1e-15 

CDS complement (6337..8502) 
/note="PUTATIVE" 
/gene="GLUTAMINE AMIDOTRANSFERASE" 
/blastp match="Bordetella bronchiseptica" 
/blast score=1e-16 

CDS complement (8181..9530) 
/note="none" 
/gene="MEMBRANE PROTEIN" 
/blast score=0.035 

CDS complement (9531..10721) 
/note="NOEC Transmembrane" 
/gene="NODULATION PROTEIN" 
/blastp match="Azorhizobium caulinodans" 
/blast score=3e-07 

CDS complement (10504..11274) 
/note="PUTATIVE" 
/gene="HYDROLASE" 
/blastp match="Streptomyces coelicolor" 
/blast score=4e-14 

CDS 12874..13689 
/note="HYPOTHETICAL Meth-transf" 
/gene="PROTEIN PA1088" 
/blast score=2e-06 
/blastp match="Pseudomonas aeruginosa" 

CDS 14195..15976 
/note="PUTATIVE, Glyco transf" 
/note=" " 
/gene="LIPOPOLYSACCHARIDE BIOSYNTHESIS" 
/blast score=2e-06 
/blastp match="Vibrio cholerae" 

CDS 15427..16512 
/note="PATHWAY: INNER CORE LIPOPOLYSACCHARIDE BIOSYNTHESIS" 
/gene="PHOSPHOHEPTOSE ISOMERASE" 
/blast score=3e-17 
/blastp match="Helicobacter pylori" 

CDS 15579..16253 
/note="none" 
/gene="PHOSPHOHEPTOSE ISOMERASE" 
/blast score=2e-22 
/blastp match="Neisseria meningitidis" 

CDS complement (16505..17656) 
/note="BIOSYNTHESIS PUTATIVE" 
/gene="LIPOPOLYSACCHARIDE" 
/blast score=2e-17 
/blastp match="Thermotoga maritima" 
/pfam match="Glycos transf 1" 

CDS complement (17657..18697) 
/note="none" 
/gene="ALR3073 PROTEIN" 
/blast score=6e-27 
/blastp match="Anabaena sp" 

CDS complement (18615..19304) 
/note="none" 
/gene="ALR4487 PROTEIN" 
/blast score=8e-07 
/blastp match="Anabaena sp" 

CDS complement (19301..20596) 
/note="ATP GTP A" 
/gene="ABC TRANSPORTER" 
/blast score=3e-61 
/blastp match="Synechocystis sp" 

CDS complement (20535..21476) 
/note="PERMEASE COMPONENT" 
/gene="POLYSACCHARIDE ABC TRANSPORTER" 
/blast score=6e-41 
/blastp match="Clostridium acetobutylicum" 

CDS complement (22025..22951) 
/note="involved in the synthesis of a polysaccharide capsule ?" 
/gene="32.3 KDA PROTEIN" 
/blast score=3e-17 
/blastp match="Sphingomonas sp" 

CDS 23155..26523 
/note="peptide synthase" 
/gene="mcyA, mcyB and mcyC" 
/blastp match="Microcystis aeruginosa" 
/blast score=0.0 

CDS 26409..34433 
/note="polyketide synthase and peptide synthase" 
/gene="mcyD, mcyE, mcyF and mcyG" 
/blastp match="Microcystis aeruginosa" 

CDS 34418..37500 
/note="CYSTATIN" 
/gene="PEPTIDE SYNTHETASE" 
/blast score=0.0 
/blastp match="Anabaena sp" 

CDS 35359..37500 
/note="gene cluster" 
/gene="nostopeptolide biosynthetic" 
/blast score=2e-41 
/blastp match="Nostoc sp" 

Sequence 37500 BP; 6199 A; 12698 C; 12769 G; 5834 T; 0 other; 



The following CDS information relates to SEQ ID NO:2. 



CDS complement (3..914) 
/blast score=2e-66 
/blastp match="AE004644.PA2177 Pseudomonas aeruginosa" 
/gene="regulator hybrid" 
/note="probable similarity to prokaryote sensory transduction proteins" 
/product="sensor/response regulator hybrid" 

CDS 924..2168 
/note="none" 
/gene="ligase" 
/blastp match="AP003013.MLR8297 Mesorhizobium loti" 
/blast score=e-152 
/product="2-amino-3-ketobutyrate CoA ligase" 

CDS 2207..3190 
/blast score=e-151 
/blastp match="AE008872.TDH Salmonella typhimurium" 
/note="none" 
/gene="dehydrogenase" 
/product="threonine 3-dehydrogenase" 
/pfam match="PF00107; adh zinc; 1" 

CDS 3373..4455 
/note="putative" 
/gene="methyltransferase" 
/blastp match="AE001866.DR0026 Deinococcus radiodurans" 
/blast score=1e-08 

CDS 4546..4959 
/blast score=2e-11 
/blastp match="AP002997.MLL1617 Mesorhizobium loti" 
/gene="unknown" 
/note="pfam00263, GSPII III, Bacterial type II and III secretion system protein, Expect = 7.8" 

CDS 5176..6192 
/blast score=5e-98 
/blastp match="AF064070.PE20 Burkholderia pseudomallei" 
/gene="glucose epimerase" 
/note="putative" 
/product="UDP-glucose 4-epimerase" 
/pfam match="PF01370; Epimerase; 1" 

CDS 6331..14043 
/note="substrat AT, malonyl; zinc depend dehydrogenase; zinc dependent adenosine deaminase putative" 
/gene="PKS 1" 
/blastp match="AF285636.WCBR 2547 Burkholderia mallei" 
/blast score=0.0 
/pfam match="ketoacyl-synt" 
/pfam match="ketoacyl-synt C" 
/pfam match="Acyl transf" 
/pfam match="SAM binding" 
/pfam match="adh zinc" 
/pfam match="pp-binding" 

CDS 14275..15408 
/blast score=e-104 
/blastp match="AF285636.WCBT Burkholderia mallei" 
/gene="acyl-CoA transferase WcbT" 
/note="putative" 
/pfam match="PF00155; aminotran 1 2; 1" 

CDS 15436..16245 
/blast score=5e-11 
/blastp match="TTDEFFMT.FMT T. thermophilus" 
/gene="formyl transferase" 
/note="evidence experimental" 
/product="methionyl-tRNA formyl transferase" 
/pfam match="PF00551; formyl transf; 1" 
/pfam match="PF02911; formyl transf C; 1" 

CDS 16287..17384 
/note="putative" 
/gene="glyco transferase" 
/blastp match="AF285636.WCBD Burkholderia mallei" 
/blast score=e-99 

CDS 17427..18158 
/blast score=2e-82 
/blastp match="AF285636.WZM Burkholderia mallei" 
transporter Wzm" 
/note="putative" 
/pfam match="PF01061; ABC 2 membrane; 1" 

CDS 18248..18847 
/blast score=7e-61 
/blastp match="AF285636.WZT Burkholderia mallei" 
/gene="ABC-2 transporter Wzt" 
/note="putative" 
/pfam match="PF00005; ABC tran; 1" 

CDS 18952..20346 
/note="putative" 
/gene="glycosyltransferase" 
/blast score=e-101 
/blastp match="AF285636.WCBE Burkholderia mallei" 
/pfam match="Glycos transf 1" 

CDS 20442..21167 
/note="putative" 
/gene="unknown" 
/blast score=1e-26 
/blastp match="AE009248.ATU3189 Agrobacterium tumefaciens" 

CDS complement (21164..24301) 
/note="transporter domain" 
/gene="unknown" 
/blast score=5e-29 
/blastp match="AE009122.ATU1658 Agrobacterium tumefaciens" 
/prosite match="PS00402; BPD TRANSP INN MEMBR; UNKNOWN 1" 

CDS complement (24351..27023) 
/note="none" 
/gene="unknown" 
/blast score=5e-29 
/blastp match="AP003581.ALR0267 Nostoc sp" 

CDS complement (27806..29686) 
/note="none" 
/gene="cell surface protein" 
/blast score=2e-34 
/blastp match="AE010748.MA0851 2567 Methanosarcina acetivorans" 
/product="cell surface protein" 

CDS complement (29535..30872) 
/note="none" 
/gene="cell surface protein" 
/blast score=2e-36 
/blastp match="AE010748.MA0851 Methanosarcina acetivorans" 

CDS complement (30848..32647) 
/note="fragment" 
/gene="O-antigen" 
/blast score=1e-72 
/blastp match="AF105060.RFBC Riftia pachyptila" 
/product="O-antigen biosynthesis protein" 
/pfam match="PF00535; Glycos transf 2" 

CDS complement (32574..35555) 
/note="putative" 
/gene="glycosyl transferase" 
/blast score=7e-46 
/blastp match="AE013462.MM2213 Methanosarcina mazei" 

CDS complement (35533..36598) 
/blast score=7e-37 
/blastp match="AE013462.MM2213 Methanosarcina mazei" 
/gene="glycosyl transferase" 
/note="putative" 

CDS complement (36516..37400) 
/blast score=8e-22 
/blastp match="AP003581.ALR0267 Nostoc sp" 
/gene="ALR0267" 
/note="putative ATP/GTP-binding protein, esterase fush 9e-14" 

Sequence 37507 BP; 5531 A; 13507 C; 13011 G; 5458 T; 0 other; 

The invention also relates to any polynucleotide sequence comprising all or part of these 
sequences (i.e., SEQ ID NOs: 1 or 2), their complementary strand, or a functional 
variant thereof. A part of the above sequences includes, preferably, at least 20 
consecutive bases, more preferably at least 50 consecutive bases thereof, even more 
preferably a coding sequence (e.g., a CDS). In this respect, SEQ ID NOs: 1 and 2 
comprise several novel open reading frames encoding novel polypeptides involved in 
biosynthetic pathways. These coding sequences are identified above in Figures 5a and 
5b. 



In a specific embodiment, the invention relates to a polynucleotide sequence comprising 
a sequence selected from nucleotides (CDS) 76 - 1134 ; 1096 - 2430 ; 1178 - 1624 ; 
2506 - 3567 ; 2906 - 4222 ; 4092 - 5321 ; 6337 - 8502 ; 8181 - 9530 ; 9531 - 10721 ; 
10504 - 11274 ; 12874 - 13689 ; 14195 - 15976 ; 15427 - 16512 ; 15579 - 16253 ; 
16505 - 17656 ; 17657 - 18697 ; 18615 - 19304 ; 19301 - 20596 ; 20535 - 21476 ; 
22025 - 22951 ; 23155 - 26523 ; 26409 - 34433 ; 34418 - 37500 and 35359 - 37500 of 
SEQ ID NO: 1 [[(figure 5a)]] or a complementary strand thereof. 

In another specific embodiment, the invention relates to a polynucleotide sequence 
comprising a sequence selected from nucleotides 3 - 914 ; 924 - 2168 ; 2207 - 3190 ; 
3373 - 4455 ; 4546 - 4959 ; 5176 - 6192 ; 6331 - 14043 ; 14275 - 15408 ; 15436 - 16245 ; 
16287 - 17384 ; 17427 - 18158 ; 18248 - 18847 ; 18952 - 20346 ; 20442 - 21167 ; 
21164 - 24301 ; 24351 - 27023 ; 27806 - 29686 ; 29535 - 30872 ; 30848 - 32647 ; 
32574 - 35555 ; 35533 - 36598 and 36516 - 37400 of SEQ ID NO: 2 [[(figure 5b)]] or a 
complementary strand thereof. 
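
As an illustration of how the nucleotide ranges listed above map onto a sequence, the following minimal sketch extracts a region given 1-based, inclusive coordinates and optionally returns its complementary strand. The function names and the toy sequence are assumptions made for this example; they are not part of the sequence listing.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_region(seq: str, start: int, end: int, complement: bool = False) -> str:
    """Extract nucleotides start..end (1-based, inclusive) from seq.

    When complement is True, the reverse complement of the region is
    returned, as for a CDS annotated as "complement (start..end)".
    """
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    region = seq[start - 1:end]
    return reverse_complement(region) if complement else region


# Toy sequence used only to demonstrate the coordinate convention.
toy = "ATGCCGTAA"
forward = extract_region(toy, 1, 3)
reverse = extract_region(toy, 7, 9, complement=True)
print(forward, reverse)
```

With a full-length sequence loaded into `seq`, a call such as `extract_region(seq, 76, 1134)` would return the first range listed for SEQ ID NO: 1.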

Variants of these sequences include any naturally-occurring variant comprising one or 
several nucleotide substitutions; sequence variants resulting from the degeneracy of the 
genetic code, as well as synthetic variants coding for functional polypeptides. Variants 
include any sequence that hybridises under high stringency conditions, as disclosed for 
instance in Sambrook et al., to any of the above sequences, and encodes a functional 
polypeptide. The invention also includes any nucleic acid molecule encoding a 
polypeptide comprising all or a fragment of an amino acid sequence encoded by a 
polynucleotide as disclosed above. Preferably, the fragment comprises at least 10 
consecutive amino acid residues, more preferably at least 20, even more preferably at 
least 30. 

The invention also relates to any vector comprising these polynucleotide sequences. 
These sequences may be DNA or RNA, preferably DNA, even more preferably double- 
stranded DNA. The invention also relates to a polypeptide encoded by a polynucleotide 
sequence as defined above. The invention also relates to a method of producing such 
polypeptides by recombinant techniques, comprising expressing a polynucleotide as 
defined above in any suitable host cell and recovering the encoded polypeptide. The 
invention also relates to a recombinant host cell comprising a polynucleotide or a vector 
as defined above. 

Another object of this invention resides in a library of polynucleotides, wherein said 
library comprises a plurality of environmental DNA fragments cloned into cloning 
vectors, wherein said environmental DNA fragments contain a common molecular 
characteristic and wherein said cloning vectors are E. coli cloning vectors comprising a 
target polynucleotide construct allowing transfer and integration of the environmental 
DNA into the genome of a selected host cell distinct from E. coli. 

The sub-DNA-libraries should have either desired genetic characteristics based on high 
or low GC content, DNA encoding a desired enzymatic activity, part or full 
biosynthetic pathways for metabolites etc., or a specific origin such as soil fractions, 
animal organs, sub-fractions of a microorganism community etc. The invention also 
allows the production of conjugative vectors with desired characteristics in accordance with the 
characteristics of the pre-identified sub-DNA-libraries and the functional analysis of 
mutants in heterologous hosts. It can also be used, without limitation, for the production 
of mutants by mutagenesis, for DNA sequencing, for knock-out of genes or biosynthetic 
pathways by insertion, or to confer transfer capabilities for expression, co-expression, 
over-expression or modification of biosynthetic pathways. 

Further aspects and advantages of the invention will be disclosed in the following 
examples, which should be regarded as illustrative and not limiting the scope of this 
application. 

Experimental Section 

A - From E. coli to Streptomyces 



In this work, we constructed a fosmid library in E. coli from total DNA prepared directly 
from soil. The library has been screened for the presence of biosynthetic pathways. We 
developed genetic tools for functional genomics that allow gene identification, 
inactivation and horizontal gene transfer from E. coli to Streptomyces. 

The cloning vectors in the library contain the ColE1 replicon for propagation in E. coli. 
Transposable elements based on the Tn5 transposon were produced and used for in vitro 
modification of selected cloning vectors. The integrated transposable elements contain a 
gene for resistance to apramycin. Conjugative derivatives were constructed by incorporating 
the origin of transfer from plasmid RP4. A conjugative and site-specific integrative 
transposon was also constructed, comprising the integrase gene from phage φC31, 
including the attP attachment site. Conjugal transfer was demonstrated from an appropriate 
E. coli donor cell to another E. coli or Streptomyces lividans recipient cell. The constructed 
transposon was tested for inactivation of the genes cloned into fosmids. The mutants 
obtained can be used for direct sequencing with adequate primers and transferred by 
conjugation into Streptomyces lividans. Transposable elements thus represent very 
useful tools for functional analysis of large DNA libraries cloned into BACs, PACs, 
fosmids or other cloning vectors in which cloned inserts must be transferred into a 
heterologous host. 

Materials and methods 

Bacterial strains, plasmids and growth conditions. 

E. coli strain DH10B (F- mcrA delta(mrr-hsdRMS-mcrBC) phi80dlacZ deltaM15 delta lacX74 
deoR recA1 endA1 araD139 delta (ara, leu)7697 galU galK lambda- rpsL nupG) 
(Epicentre) was used for fosmid and plasmid transformation and DNA amplification. 
Unless specifically described, all DNA manipulations were performed according to 
Sambrook, J., et al. (1989). 

Soil DNA extraction and DNA libraries construction. 

Total bacterial community DNA was extracted and large DNA libraries have been 
constructed into fosmids according to the method described in WO 01/81357. 

Fosmid DNA extraction and Purification 

Fosmid DNA from the soil library was extracted from pools of 96 clones. Cultures of 
recombinant clones were grown in 96- and 48-deep-well plates, in 1 ml and 2 ml, 
respectively, of LB medium containing 12.5 µg ml⁻¹ of chloramphenicol. Cultures were 
grown at 37°C with shaking at 250 rpm for 22 hours. DNA extraction was done by using the 
Nucleobond PC 100 extraction kit (Macherey Nagel). 

PCR screening for the detection of PKS genes 
Primers design 

Degenerate PCR primer sets were designed to specifically amplify PKS nucleic acid 
sequences. Multiple sequence alignment of PKS domains revealed highly conserved 
motifs, in particular in the β-ketoacyl synthase domain. Primers Lib1F and Lib2R 
were designed in conserved motifs of the β-ketoacyl synthase gene. Lib1F (sense 
primer, 5'- GGSCCSKCSSTSDCSRTSGAYACSGC -3') and Lib2R (antisense primer, 
5'- GCBBSSRYYTCDATSGGRTCSCC -3') were deduced from the β-ketoacyl synthase 
peptide sequences GP(AS)(LV)(AST)(IV)DTAC and GDPIE(TVA)(RAQ)A, 
respectively. The specific fragment amplified with Lib1F / Lib2R was approximately 
465 bp (corresponding to 155 amino acids). Specificity and efficiency of the PCR 
systems were validated by testing on positive DNA controls (i.e. genomic DNA from 
type I PKS producing strains such as Bacillus subtilis, Streptomyces lividans, 
Streptomyces ambofaciens and Ralstonia solanacearum) and negative DNA controls 
(genomic DNA from strains which are known not to contain PKS genes). 
Furthermore, DNA extracted from soil samples was tested to calibrate the PCR techniques. 
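
To make the degeneracy of the two primers concrete, the sketch below counts how many distinct unambiguous oligonucleotides each degenerate primer represents, using the standard IUPAC nucleotide ambiguity codes. The primer strings are copied from the paragraph above; the function name `degeneracy` is an assumption introduced for this example.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of unambiguous sequences encoded by a degenerate primer."""
    total = 1
    for code in primer.upper():
        total *= len(IUPAC[code])  # raises KeyError on a non-IUPAC symbol
    return total


LIB1F = "GGSCCSKCSSTSDCSRTSGAYACSGC"
LIB2R = "GCBBSSRYYTCDATSGGRTCSCC"

for name, primer in (("Lib1F", LIB1F), ("Lib2R", LIB2R)):
    print(name, len(primer), "nt, degeneracy", degeneracy(primer))
```

A count of this kind is one way to judge how much of each primer pool can match any given template.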

PCR conditions 

PCR conditions were optimised, in particular for the concentrations of DMSO, MgCl2, 
primers and DNA template. For PCR using microorganism genomic DNA and 
soil DNA as template (50 to 200 ng), the PCR mix (50 µl) contained 250 µM of dNTP, 5 
mM MgCl2 final, 2.5% DMSO, 1X PCR buffer, 0.75 µM of each primer, 2.5 U of 
Taq DNA polymerase (Sigma) and sterile distilled water. For PCR using fosmid pooled 
DNA as template (100 to 500 ng), the PCR mix (50 µl) contained 250 µM of dNTP, 5 
mM MgCl2 final, 5% DMSO, 1X PCR buffer, 0.75 µM of each primer, 2.5 U of Taq 
DNA polymerase (Sigma) and sterile distilled water. For identification of positive 
clones in 96-well microtiter plates, 25 µl of each bacterial culture was used as template and 
PCR conditions were the same as above. The thermocycling program was: a denaturation 
step at 96°C for 5 minutes; then cycles of 1 minute at 96°C, 1 minute at 65°C and 1 minute 
at 72°C. During the first 7 cycles, the annealing temperature was lowered by 1°C per cycle 
until 58°C was reached. A subsequent 40 cycles were carried out with the annealing 
temperature at 58°C. A final extension step was at 72°C for 7 minutes. For identification 
of positive clones in 96-well microtiter plates, the first denaturation step at 96°C lasted 
8 minutes. The other steps were the same as described above. PCR reactions were 
performed with a PTC 200 thermocycler (MJ Research). 
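
The touchdown annealing profile described above can be written out cycle by cycle. The sketch below is one reading of the text, assuming that the seven touchdown cycles anneal at 65°C down to 59°C and that the 40 subsequent cycles anneal at 58°C; the function name and its parameters are hypothetical.

```python
def touchdown_schedule(start_anneal=65, final_anneal=58, step=1, plateau_cycles=40):
    """Return the annealing temperature (in °C) used in each PCR cycle.

    The temperature drops by `step` per cycle from `start_anneal` until
    `final_anneal` is reached, then stays there for `plateau_cycles` cycles.
    """
    schedule = []
    temp = start_anneal
    while temp > final_anneal:
        schedule.append(temp)
        temp -= step
    schedule.extend([final_anneal] * plateau_cycles)
    return schedule


schedule = touchdown_schedule()
print("touchdown cycles:", schedule[:7])
print("total cycles:", len(schedule))
```

Under these assumptions the program runs 47 cycles in total, each with 1 minute of denaturation at 96°C and 1 minute of extension at 72°C around the annealing step.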

PCR products analysis 

PCR products of about 465 bp were purified on agarose gel with a gel extraction kit 
(Qiagen) according to the manufacturer's recommendations. The first approach consisted of 
subcloning PCR products using the Topo PCR II kit (Invitrogen). Recombinant plasmids 
were extracted using the QIAprep plasmid extraction kit (Qiagen) and sequenced with 
forward and reverse M13 primers on a CEQ 2000 automated sequencer (Beckman 
Coulter). The second approach consisted of direct sequencing of PCR products. Sequencing 
data were compared with the nucleic acid and protein GenBank databases using the BLAST program. 

Sequencing of the identified fosmid insert DNA and sequence analysis 

Fosmid inserts were sequenced using either a transposon-mediated or a shotgun 
subcloning approach. Transposition was performed using the Transposition Kit 
sold by Epicentre, according to the manufacturer's instructions. For shotgun subcloning, 
transformants were grown for 16 hours at 37°C. Fosmid extraction was done by using 
the Nucleobond PC 100 extraction kit (Macherey Nagel). DNA was partially restricted 
with Sau3A, sized by standard gel electrophoresis for fragments ranging from 1 to 3 
kb, and cloned into Bluescript vector according to Sambrook et al. (1989). 
Sequence analysis was performed by identifying ORFs using Frameplot of 
the GC3. Each identified ORF was compared to gene databases by using the BLAST 
program. PKS domains were determined by aligning the obtained sequences against already 
described PKS domains from domain databases. 

Sequencing 

Sequencing reactions were performed with 1 µg of DNA and 3.6 pmol of primer, using the 
CEQ 2000 Dye Terminator Cycle Sequencing kit (Beckman Coulter) under conditions 
proposed by the supplier. Ten µL of reaction products were precipitated using 4 µL of 
solution containing 1.5 M NaOAc, 50 mM EDTA and 60 µL cold 95% ethanol/dH2O 
from -20 °C. The pellet was washed 2 times with 200 µL 70% ethanol/dH2O, vacuum 
dried and dissolved in 40 µL sample loading solution (supplied in kit). Sequencing 
reactions were run on a CEQ 2000 sequencer (Beckman Coulter). 

Plasmids construction and validation 

Plasmid pP1 was constructed as follows. A 941 bp DNA fragment containing the native 
promoter region and the aa(3)IV gene was amplified by polymerase chain reaction (PCR) 
using primers AmF ( d-CCCTAAGATCTGGTTCATGTGCAGCTCCATC, SEQ ID 
NO: 9) and AmR ( d-TAGTACCCGGGGATCCAACGTCATCTCGTTCTCC, SEQ ID 
NO: 10). One-hundred-microlitre reactions were performed containing 0.1 µM each of 
primers, 1 X Vent DNA polymerase buffer (NEB), 0.2 µM of each deoxyribonucleoside 
triphosphate (dNTP), 50 ng of the DNA template and 2 U of Vent DNA polymerase 
(NEB). The PCR mixture was heated for 4 min at 94 °C in a PTC-200 thermocycler (Peltier) 
and cycled 25 x at 94 °C for 60 sec, 59 °C for 30 sec and at 72 °C for 70 sec. The final 
extension was performed at 72 °C for 7 min. 

The PCR product was purified using the GFX DNA purification kit (Amersham), then digested 
with Bgl II and Sma I restriction enzymes. The 941 bp BglII/SmaI fragment was inserted 
into the Bam HI, Sma I sites of pMOD plasmid (EPICENTRE). DH10B E. coli was 
transformed with pP1 and subjected to apramycin selection on LB agar plates. Six 
colonies surviving apramycin selection were grown in liquid LB medium and final 
pPL1 candidates were thoroughly checked via PCR and restriction mapping. 

To construct conjugative plasmid pPL1, a 750 bp oriT DNA region from plasmid RP4 
was amplified via PCR using primers oriTF (d- 
GCGGTAGATCTGTGATGTACTTCACCAGCTCC, SEQ ID NO: 11) and oriTR 
(TAGTACCCGGGGATCCGACGGATCTTTTCCGCTGCAT, SEQ ID NO: 12). PCR 
conditions were as above. Amplified DNA was digested using Bgl II and Sma I 
restriction enzymes. The BglII/SmaI DNA fragment was purified after gel 
electrophoresis on 0.7% agarose. The purified fragment was ligated into pP1 plasmid digested 
with Bam HI and Sma I restriction enzymes. 

The φC31 integrase gene and attachment site (attP) were amplified via PCR using primers 
Fint (d-AACAAAGATCTCCCGTACTGACGGACACACCG, SEQ ID NO: 13) and PJ 
(d-CGGGTGTCTCGCATCGCCGCT, SEQ ID NO: 14). The amplified DNA fragment was 
purified with the GFX kit (Amersham) and phosphorylated using T4 polynucleotide kinase 
(NEB) under conditions recommended by the enzyme manufacturer. The phosphorylated 
DNA fragment was cloned into pPL1 vector opened with SmaI restriction enzyme 
(NEB) and dephosphorylated with calf alkaline phosphatase (NEB). DH10B E. coli was 
transformed with the ligation mixture using a Bio-Rad pulsing apparatus and protocols 
provided by Bio-Rad. Twelve transformants were analyzed by PCR for the presence of the 
integrase gene. Orientation of the integrase gene was verified by restriction analysis using 
Bgl II and EcoRI restriction enzymes. The resulting plasmid was named pPAOI6. 
To construct pPAOI6-A plasmid, pPAOI6 plasmid was digested with EcoRI and BglII 
restriction enzymes, followed by digestion with mung bean nuclease (NEB). The linearised 
plasmid was self-ligated and transformed into DH10B cells. 

Plasmid preparation of fosmid DNA 

Fosmid and BAC DNA for sequencing was prepared by using the Nucleobond AX kit 
(Macherey-Nagel), following the protocol for BACs and cosmids as specified by the manufacturer. 

Mutagenesis 

Transposon Tn-pPAOI6 was prepared by digestion of pPAOI6 plasmid using PvuII 
restriction enzyme, followed by separation on agarose gel and purification of the fragment 
containing the transposon from the gel using a Qiagen kit. The same molar ratio of transposon 
and corresponding fosmid was used for mutagenesis in vitro using Tn5 transposase 
(Epicentre) and conditions specified by the manufacturer. An aliquot of the 
transposed mixture was transformed by electroporation into competent DH10B E. coli cells. 

Conjugation E. coli- Streptomyces lividans TK24 

Conjugation experiments were done using 6 x 10⁶ E. coli S17.1 cells containing 
conjugative plasmids or fosmids. The E. coli cells were grown in LB medium with the adequate 
antibiotic. The cells were collected by centrifugation, washed two times using the same 
volume of LB medium, concentrated to 10⁸ cells/ml and overlaid on LB plates 
containing 2 x 10⁶ pregerminated Streptomyces lividans TK24 spores. The cell mixture 
was grown overnight at 30°C and E. coli cells were washed off three times using 2 ml of 
LB medium. The plates were overlaid with top agar containing NAL (nalidixic acid) 
and the appropriate antibiotic. Plates were incubated for 4 days at 30°C and 
transformant Streptomyces colonies were isolated on HT medium (Pridham et al. 1957) 
containing NAL and the same appropriate antibiotic. 

Results 

Construction of a Transposon Tn <Apra>. 

The E. coli aminoglycoside-(3)-acetyltransferase IV gene (aa(3)IV) was amplified by PCR 
and cloned in pMOD vector (Epicentre). This selective marker has the advantage of allowing 
positive selection in E. coli and in Streptomyces lividans. The transposon can be used for 
insertional inactivation in vitro using purified Tn5 transposase (Epicentre). The structure 
of the constructed pP1 vector and transposon is shown in Figures 1a and 1b. 

Construction of a conjugative plasmid-transposon. 

A conjugative vector-transposon was constructed by cloning the origin of transfer from 
plasmid RP4 into pP1 vector, producing pPL1 vector (Figures 2a and 2b). The origin of 
transfer was cloned in such an orientation that the selective aa(3)IV gene is the last 
transferred during conjugation. The pPL1 vector was introduced into the specific E. coli S17.1 
strain, which carries RP4 plasmid integrated into the chromosome. In a conjugation experiment 
between donor strain S17.1 carrying pPL1 plasmid and a DH10B E. coli recipient, we 
obtained DH10B strain carrying pPL1 plasmid. These data show that the cloned oriT 
fragment is functional in the pPL1 plasmid. Plasmid pPL1 can be used for DNA cloning and 
gene inactivation by homologous recombination. Cloned genes or parts of cloned genes 
could then be transferred by conjugation into another host. Another advantage of 
this vector is the conjugative transposon, which can be excised from the vector and inserted 
randomly in vivo into another DNA molecule by purified Tn5 transposase. 

Construction of a conjugative, site specific integrative plasmid- transposon for 
horizontal gene transfer between E. coli and Streptomyces strains. 

Plasmid pPL1 was used to clone an integrase gene from phage φC31, resulting in plasmid 
pPAOI6 (Figures 3a and 3b). We tested several clones for horizontal transfer between E. coli 
S17.1 strain and Streptomyces lividans TK24 strain. The best transfer was obtained for 
plasmid pPAOI6, where the integrase gene is in the opposite orientation to the gene 
for resistance to apramycin. Conjugative transfer of pPAOI6 into S. lividans 
strain is confirmed by resistance to apramycin or G418. Additional confirmation of 
transfer was obtained using PCR. We were able to amplify a 2 kb insert using 
specific primers for the φC31 integrase gene, and no PCR amplification was obtained for 
the control S. lividans TK24 strain. 

The pPAOI6 transposon was cloned into the EcoRV site of plasmid pGPS3 (New England 
Biolabs). This construction allows transposition not only by Tn5 transposase but also 
using Transposase ABC (New England Biolabs). The resulting plasmid, pTn5-7AOI, is 
shown in Figure [[10]]9. 

The goal of these constructions was to produce transposons for further use in the 
functional analysis of the metagenomic DNA library from soil that was constructed 
in the laboratory (Figures 4a, 4b, 4c). 

Functional analysis of the metagenomic DNA library from soil. 

The fosmid library consists of 120,512 clones, each containing an insert of ~40 kb of soil DNA. 
The library thus contains approximately 4.8 Gbp of DNA cloned from soil. Ten percent 
of the library was screened by PCR for the presence of genes 
involved in the production of secondary metabolites (PKS). Using a gene-module-specific set 
of primers, we were able to identify positive clones organized in microtiter plates (96 
wells). Sequences (based on PCR products) obtained from fifteen randomly chosen positive 
clones indicate that the DNA library contains very little sequence redundancy, limited to 
one, and that the sequences are new and very diverse in comparison to gene 
databases (data not shown). 
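
The library size quoted above follows from simple arithmetic, which can be sketched as below. This is only an illustrative check: the function name is hypothetical, and it assumes every clone carries an insert of exactly 40 kb, whereas the text gives the insert size only as an approximation.

```python
def library_size_bp(n_clones: int, mean_insert_bp: int) -> int:
    """Total cloned DNA in bp, assuming every clone carries a mean-sized insert."""
    return n_clones * mean_insert_bp


N_CLONES = 120_512        # number of fosmid clones stated in the text
MEAN_INSERT_BP = 40_000   # "~40 kb" treated as exactly 40,000 bp (assumption)

total_bp = library_size_bp(N_CLONES, MEAN_INSERT_BP)
total_gbp = total_bp / 1e9

# Ten percent of the library was screened; rounded down to whole clones.
screened_clones = N_CLONES // 10

print(f"{total_gbp:.2f} Gbp cloned; about {screened_clones} clones screened")
```

Under these assumptions the total rounds to the 4.8 Gbp figure given in the text.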

Fosmid DNA was prepared from two positive clones (FS3-124 and FS3-135) and 
analyzed by sequencing. In silico DNA analysis shows high G+C contents of 72% and 
69%, respectively, for the cloned inserts and the presence of gene clusters that could be 
involved in the biosynthesis of secondary metabolites (Figures 5a and 5b). No specific 
phenotype was observed for the two clones in E. coli. We employed pPAOI6 transposon 
mutagenesis to produce conjugative mutants. Transposon mutants FS3-124::pPAOI6 and 
FS3-135::pPAOI6 were isolated using apramycin as the selective antibiotic. The 
mutants obtained were then tested on LB plates containing chloramphenicol. About 1% of the 
tested clones were chloramphenicol sensitive. These clones contain the transposon inserted 
into the locus encoding the chloramphenicol resistance gene and not into the cloned DNA insert. 
ApraR and ChloS transposon mutants were then used for horizontal gene transfer into 
the Streptomyces lividans TK24 strain. Fosmid DNA was prepared from the mutants and 
transformed into the E. coli S17.1 strain. Horizontal gene transfer between E. coli S17.1 
and Streptomyces lividans was made possible by the inserted pPAOI6 transposon. 
Transconjugants of Streptomyces lividans were tested by PCR to confirm gene 
transfer and integration of the conjugative fosmid into the S. lividans chromosome. Both 
transconjugants showed an increase in doubling time, morphological modifications and 
pigment production in comparison to the control (Figure [[6]] 5). 

B - From E. coli to Bacillus subtilis 

Plasmid pPSB was constructed as follows: a part of the amyE gene from B. subtilis was 
amplified by PCR using primers amyE-BamHI: atcgcaggatcctgaggactctcgaacccg (SEQ 
ID NO: 15) and amyE-EcoRI: cgactgaattcagatctagcgtgtaaattccgtctgc (SEQ ID NO: 16). 
The DNA fragment was digested with the EcoRI and BamHI restriction enzymes and ligated into 
the EcoRI, BglII sites of plasmid pPAOI6. The transposable element Tn<amyE-int 
φC31-oriT-apra> is shown in Figure [[7]]6 with plasmid pPSB. 
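
The restriction sites carried by the two primers can be located with a short script. This is a minimal sketch rather than part of the described method: the function and dictionary names are hypothetical, and only the primer sequences and the standard recognition sequences of BamHI (GGATCC), EcoRI (GAATTC) and BglII (AGATCT) are taken as given.

```python
# Standard recognition sequences, written in lower case to match the primers.
SITES = {"BamHI": "ggatcc", "EcoRI": "gaattc", "BglII": "agatct"}

# Primer sequences as given in the text (SEQ ID NO: 15 and 16).
PRIMERS = {
    "amyE-BamHI": "atcgcaggatcctgaggactctcgaacccg",
    "amyE-EcoRI": "cgactgaattcagatctagcgtgtaaattccgtctgc",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for each site present in seq."""
    seq = seq.lower()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
```

Run on the two primers, the search reports a single BamHI site in amyE-BamHI, and one EcoRI site followed directly by one BglII site in amyE-EcoRI.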

Plasmid pPSBery (Figure [[8]]7) was constructed by cloning the erm AM gene into the pPSB 
plasmid. A 1140 bp Sau3A DNA fragment containing the erm AM gene with its own 
promoter was cloned from plasmid pMUTIN (Wagner et al. 1998) into the BamHI site 
of plasmid pPSB. The orientation of the ery gene was confirmed by sequencing. The transposable 
element is Tn<amyE-int φC31-ery-oriT-apra> (Figure [[8]]7). 

Plasmid pPSBery-AI was obtained after SmaI digestion and self-ligation of the core 
plasmid (Figure [[9]]8). In this construction, the φC31 integrase gene was deleted from the 
transposable element. The new transposon is Tn<amyE-ery-oriT-apra> (Figure [[9]]8). All 
transposons could be released as linear DNA by PvuII digestion of the plasmids 
mentioned above and transposed in vitro by Tn5 transposase (Epicentre). 

The selection of transposed elements was done using 100 µg/ml erythromycin or 
40 µg/ml apramycin in E. coli, or 0.3 µg/ml erythromycin in B. subtilis. DNA was 
introduced by electrotransformation into electrocompetent E. coli strains or by 
competence into B. subtilis. Integration of imported DNA into the amyE locus of the B. subtilis 
chromosome was tested using the pPSBery-AI plasmid. Integration was confirmed by 
PCR using plasmid-specific and amyE locus-specific primers. Fifteen eryR B. subtilis 
clones were tested by PCR. All transformants showed integration at the amyE locus of the B. 
subtilis chromosome, confirming the functionality of the method and constructs. 